{"id":81459,"date":"2018-11-02T16:52:16","date_gmt":"2018-11-02T16:52:16","guid":{"rendered":"https:\/\/www.red-gate.com\/simple-talk\/?p=81459"},"modified":"2022-04-24T21:08:32","modified_gmt":"2022-04-24T21:08:32","slug":"creating-calculated-columns-using-dax","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/bi-sql-server\/creating-calculated-columns-using-dax\/","title":{"rendered":"Creating Calculated Columns Using DAX"},"content":{"rendered":"<p><strong>The series so far:<\/strong><\/p>\n<ol>\n<li><a href=\"https:\/\/www.red-gate.com\/simple-talk\/sql\/bi\/creating-calculated-columns-using-dax\/\">Creating Calculated Columns Using DAX<\/a><\/li>\n<li><a href=\"https:\/\/www.red-gate.com\/simple-talk\/sql\/bi\/creating-measures-using-dax\/\">Creating Measures Using DAX<\/a><\/li>\n<li><a href=\"https:\/\/www.red-gate.com\/simple-talk\/sql\/bi\/using-the-dax-calculate-and-values-functions\/\">Using the DAX Calculate and Values Functions<\/a><\/li>\n<li><a href=\"https:\/\/www.red-gate.com\/simple-talk\/sql\/bi\/using-the-filter-function-in-dax\/\">Using the FILTER Function in DAX<\/a><\/li>\n<li><a href=\"https:\/\/www.red-gate.com\/simple-talk\/sql\/bi\/cracking-dax-the-earlier-and-rankx-functions\/\">Cracking DAX\u00a0\u2013 the EARLIER and RANKX Functions<\/a><\/li>\n<li><a href=\"https:\/\/www.red-gate.com\/simple-talk\/sql\/bi\/using-calendars-and-dates-in-power-bi\/\">Using Calendars and Dates in Power BI<\/a><\/li>\n<li><a href=\"https:\/\/www.red-gate.com\/simple-talk\/sql\/bi\/creating-time-intelligence-functions-in-dax\/\"><span>Creating Time-Intelligence Functions in DAX<\/span><\/a><\/li>\n<\/ol>\n\n<p>DAX is Microsoft\u2019s new(ish) language which allows you to return results from data stored using the xVelocity database engine, which, unlike for most databases, stores data in columns rather than rows. You can program in DAX within Power BI (Microsoft\u2019s flagship BI tool), PowerPivot (an Excel add-in which allows you to create pivot tables based on multiple tables) and Analysis Services Tabular Model (the successor to SSAS Multi-Dimensional, which allows you to share data models and implement security).<\/p>\n<p>As the demand for these technologies increases, I have been teaching these topics (<a href=\"https:\/\/www.wiseowl.co.uk\/courses\/power-bi-desktop.htm\">Power BI<\/a>, <a href=\"https:\/\/www.wiseowl.co.uk\/courses\/excel-data-analysis.htm\">PowerPivot<\/a> and <a href=\"https:\/\/www.wiseowl.co.uk\/courses\/ssas-tabular-analysis-services.htm\">SSAS Tabular<\/a>, as well as in <a href=\"https:\/\/www.wiseowl.co.uk\/courses\/dax.htm\">DAX<\/a>) to countless students in the UK. In the hopes that I can reach even more students, I decided to write this series of articles for the great readers of Simple-Talk.<\/p>\n<p>This article shows how to create a calculated column using DAX. If you\u2019re not sure what I mean by this, here\u2019s an example from Power BI:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"637\" height=\"171\" class=\"wp-image-81513\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-49.png\" \/><\/p>\n<p>I\u2019ve used Power BI for all my examples, but the methods and formulae used would be identical in PowerPivot or SSAS Tabular.<\/p>\n<h2>The Underlying Database for this Series of Articles<\/h2>\n<p>The examples in this article are based on this simple database, containing sales of fluffy toys:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"964\" height=\"265\" class=\"wp-image-81514\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-50.png\" \/><\/p>\n<p>You can follow all the examples on your computer \u2013 just <a href=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/10\/Simple-Toys.xlsx\">download this Excel workbook<\/a> containing these four tables or run <a href=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/10\/Generate-simple-toy-database.sql\">this SQL Server script<\/a> to generate the database. You\u2019ll then need to import the tables into a Power BI report, PowerPivot data model, or SSAS Tabular model.<\/p>\n<h2>Creating a Basic Calculated Column<\/h2>\n<p>Let\u2019s start with the basics. Suppose you want to calculate the value of each sale in the database by multiplying the price times the quantity. For the first item in the <em>Sales<\/em> table, this should give 9.49 since someone bought 1 item on 10<sup>th<\/sup> February 2017, at a price of 9.49.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"597\" height=\"117\" class=\"wp-image-81515\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-51.png\" \/><\/p>\n<p>To do this, create a formula which multiplies the price by the quantity for each row. Start by adding a column. Here\u2019s how to do this in Power BI:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"338\" height=\"373\" class=\"wp-image-81516\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-52.png\" \/><\/p>\n<p>Now rename the column. Call it <em>Sales value. <\/em>You don\u2019t need to be afraid of using spaces in column names in Power BI; they work perfectly:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"261\" height=\"90\" class=\"wp-image-81517\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-53.png\" \/><\/p>\n<p>You can now either click on the first column to which you want to refer or type in a square bracket ( [ ) symbol to bring up a list of columns:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"360\" height=\"180\" class=\"wp-image-81518\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-54.png\" \/><\/p>\n<p>It\u2019s enough to type in a single <em>P<\/em> in this instance:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"358\" height=\"100\" class=\"wp-image-81519\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-55.png\" \/><\/p>\n<p>You can now press the TAB key to add the selected item into your formula, then continue typing to complete your formula:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"594\" height=\"133\" class=\"wp-image-81520\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-56.png\" \/><\/p>\n<h2>Alternatives to Calculated Columns<\/h2>\n<p>Calculated columns aren\u2019t the only way to show the sales value for each transaction. There are at least three possible alternative ways to get at the sales value for each row in the example. Firstly, you could add the column to the underlying data source, for example, by creating a view in SQL like the one below:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"505\" height=\"486\" class=\"wp-image-81521\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-57.png\" \/><\/p>\n<p>A second way to avoid using calculated columns would be to do the calculation using the M formula language in the Query Editor (for SSAS Tabular this is only possible for SQL Server 2017 and later):<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"246\" height=\"178\" class=\"wp-image-81522\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-58.png\" \/><\/p>\n<p>And a third way is not to add the column at all, but to create it on the fly in a measure. Here\u2019s an example of a measure to do this. I\u2019ll cover measures in a later article in this series, but the comments should make it reasonably obvious what this is doing:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"427\" height=\"236\" class=\"wp-image-81523\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-59.png\" \/><\/p>\n<p>The obvious question is which of these three methods is the best one? I think it\u2019s easiest to create the calculated column in DAX, but it will slow up processing data and eat up memory. How so? When you refresh your data in a data model, Power BI, PowerPivot or SSAS Tabular divides the process (forgive the pun) into two steps:<\/p>\n<ol>\n<li>Processing &#8212; loading a fresh copy of the data from the underlying data source into the data model.<\/li>\n<li>Recalculating &#8212; updating all the relationships and hierarchies in your model and recalculating all the calculated columns.<\/li>\n<\/ol>\n<p>The more calculated columns you have, the slower processing may be. The column will also have a significant effect on the memory you use since it has much higher cardinality than either of the two columns it references. Here are the values for the price, quantity and sales amount columns for every sales transaction in the database:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"358\" height=\"670\" class=\"wp-image-81524\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-60.png\" \/><\/p>\n<p>The number of unique values in the three columns are as follows:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"209\" height=\"109\" class=\"wp-image-81525\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-61.png\" \/><\/p>\n<p>All programs using DAX store information in columns, not rows, an important point which I\u2019ll keep coming back to in this series of articles. When importing the data, Power BI will construct a dictionary of unique values for each column, and the above table shows that the dictionary for the sales value column will occupy more space than the dictionary for the other two columns put together. This won\u2019t be a problem with 25 rows, or even with 25,000, but with 25 million or even 25 billion rows, it might start using up valuable memory. Weighed against this is the fact that it will be convenient to have access to a ready-calculated sales column.<\/p>\n<p>If you\u2019re interested in how DAX stores data in columns I\u2019ve included some more detailed examples at the end of this article.<\/p>\n<h2>Referring to Table Names in DAX Formulae<\/h2>\n<p>In general, it\u2019s best practice in DAX to refer to columns using the <code>TableName[ColumnName]<\/code> syntax \u2013 so the formula would become:<\/p>\n<pre class=\"theme:vs2012-simple-talk lang:c# decode:true\">Sales value = Sales[Price] * Sales[Quantity]<\/pre>\n<p>It\u2019s helpful sometimes to type in a single apostrophe character to bring up a list of all the table names in a data model:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"397\" height=\"199\" class=\"wp-image-81526\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-62.png\" \/><\/p>\n<p>The apostrophes are essential for any table which has a name which is also a reserved word in Power BI, PowerPivot or SSAS Tabular. The following calculated column wouldn\u2019t work without the apostrophes since <code>Product<\/code> is a reserved word in Power BI:<\/p>\n<pre class=\"theme:vs2012-simple-talk lang:c# decode:true\">Discounted list price = 0.90 * 'Product'[ListPrice]<\/pre>\n<h2>Using Different DAX Editors<\/h2>\n<p>The Power BI, PowerPivot, and SSAS Tabular editors have all improved greatly over the years and versions (they needed to), but they can still be annoying at times. I have two particular bugbears:<\/p>\n<ol>\n<li>To put a carriage return into a formula, you need to hold down the SHIFT or ALT key and press ENTER. If you forget to hold down the SHIFT or ALT key, Power BI tries to validate the formula you\u2019re editing, which invariably fails and wastes precious seconds of your life.<\/li>\n<li>There\u2019s no easy way that I\u2019ve ever found in Power BI to make the font bigger. You can click on the chevron shown below to make the editing window single-line or multi-line, but you can\u2019t use CTRL and the mouse wheel to zoom in and out:<\/li>\n<\/ol>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"798\" height=\"357\" class=\"wp-image-81527\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-63.png\" \/><\/p>\n<p>Because of these inconveniences, I often write my DAX formula in DAX Studio, a standalone editor which you can download <a href=\"http:\/\/daxstudio.org\/\">here<\/a>. It\u2019s reasonably intuitive to use as the main thing to do when you run the application is to connect to a data model:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"593\" height=\"309\" class=\"wp-image-81528\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-64.png\" \/><\/p>\n<p>You can choose the Power BI or SSAS Tabular model you want to connect to, provided that you have a Power BI file or SSAS Tabular model open. To use DAX with PowerPivot, you must install the DAX Studio add-in, and then run this from within Excel.<\/p>\n<p>You can now type in your DAX formulae, including dragging field or table names from the metadata shown on the left into your formula:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"503\" height=\"311\" class=\"wp-image-81529\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-65.png\" \/><\/p>\n<p>Eventually, you\u2019ll have to copy this formula into Power BI, but at least it gives you a nice editing environment.<\/p>\n<h2>Testing Conditions Using IF<\/h2>\n<p>DAX contains the same <code>IF<\/code> function as Excel, which allows you to test whether a condition is true or not, and return different values in either case:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"608\" height=\"94\" class=\"wp-image-81530\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-66.png\" \/><\/p>\n<p>Suppose that you want to deliver a verdict on each sale \u2013 if it costs more than 10 units (let\u2019s say they\u2019re dollars, for the sake of argument), you\u2019ll call it <em>expensive<\/em>. Otherwise you\u2019ll call it <em>cheap<\/em>. Here\u2019s a formula which would do this:<\/p>\n<pre class=\"theme:vs2012-simple-talk lang:c# decode:true \">Verdict = IF(\r\n\r\n    \/\/ if the sales amount is more than 10 ...\r\n    [Sales value] &gt; 10,\r\n    \r\n    \/\/ ... then show EXPENSIVE ...\r\n    \"Expensive\",\r\n    \r\n    \/\/ ... otherwise, show CHEAP\r\n    \"Cheap\"\r\n)\r\n<\/pre>\n<p>This new column would correctly distinguish between sales of more than or less than 10 dollars:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"255\" height=\"129\" class=\"wp-image-81531\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-67.png\" \/><\/p>\n<h2>Testing Conditions Using SWITCH<\/h2>\n<p>It\u2019s time now to complicate things \u2013 what happens if you want to show a verdict on each sale with the following criteria?<\/p>\n<ul>\n<li>Up to 5 dollars is <em>Cheap<\/em>.<\/li>\n<li>Between 5 and 10 dollars is <em>Cheapish<\/em>.<\/li>\n<li>From 10 to 15 dollars is <em>Middling<\/em>.<\/li>\n<li>Anything above 15 dollars is <em>Expensive<\/em>.<\/li>\n<\/ul>\n<p>You could do this by nesting <code>IF<\/code> functions within each other, but the results would be difficult to read. A better solution is to use the <code>SWITCH<\/code> function, a version of which also exists in the latest versions of Excel, although many people aren\u2019t aware of this.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"642\" height=\"77\" class=\"wp-image-81532\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-68.png\" \/><\/p>\n<p>The first argument (the first bit of information the function asks for) is an expression. This can be anything, but typically is <code>True()<\/code>, showing that it will keep trying to evaluate possibilities until it finds something which is true. Once true is found, it will immediately stop. This is easier to understand if you look at the suggested formula:<\/p>\n<pre class=\"theme:vs2012-simple-talk lang:c# decode:true\">Better Verdict = SWITCH(\r\n\r\n    \/\/ we're looking for something which is true\r\n    TRUE(),\r\n    \r\n    \/\/ start at the bottom - is this item's sales less \r\n    \/\/ than or equal to 5?  If so, it's cheap\r\n    [Sales value] &lt;= 5,\r\n    \"Cheap\",\r\n    \r\n    \/\/ if we reach here, it wasn't less than 5, so \r\n    \/\/ maybe it was greater than 5 but less than 10?\r\n    [Sales value] &lt; 10,\r\n    \"Cheapish\",\r\n    \r\n    \/\/ OK, that failed too - maybe it was between 10\r\n    \/\/ and 15?\r\n    [Sales value] &lt; 15,\r\n    \"Middling\",\r\n    \r\n    \/\/ if we get here, the sales value must be 15 or more\r\n    \"Expensive\"\r\n)\r\n<\/pre>\n<p>You can have as many pairs of conditions and values as you like. You don\u2019t have to have the ELSE value at the end, although, it\u2019s highly recommended, as you may have inadvertently omitted some values from the tests.<\/p>\n<h2>Linking Tables Together Using RELATED<\/h2>\n<p>So far Excel users are probably wondering what\u2019s different about DAX. You are about to find out! Suppose that you import all the worksheets in the supplied Excel workbook apart from the last one called <em>TempSales<\/em>:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"194\" height=\"209\" class=\"wp-image-81533\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-69.png\" \/><\/p>\n<p>Your data model should now look like this:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"979\" height=\"354\" class=\"wp-image-81534\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-70.png\" \/><\/p>\n<p>This makes for a messy interface, as a user must figure out from which table to get fields:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"227\" height=\"254\" class=\"wp-image-81535\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-71.png\" \/><\/p>\n<p>It would be better for the user if you could combine the fields into a single table like this:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"210\" height=\"325\" class=\"wp-image-81536\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-72.png\" \/><\/p>\n<p>There are two steps to accomplish this. The first is to hide tables and columns from report view, which you can do by right-clicking on them. This doesn\u2019t remove them from the data, however.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"988\" height=\"365\" class=\"wp-image-81537\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-73.png\" \/><\/p>\n<p>The other part of the magic is to add calculated columns to show the city, country, and product for each sale:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"889\" height=\"142\" class=\"wp-image-81538\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-74.png\" \/><\/p>\n<p>To do this, use the <code>RELATED<\/code> function:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"407\" height=\"75\" class=\"wp-image-81539\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-75.png\" \/><\/p>\n<p>The <code>RELATED<\/code> function will look up the value of any other column in any other table, providing that there is a direct path between the two tables and that the question makes sense. For example, you could look up for any sale the name of the country in which it took place because each sale belongs to a single city, and each city belongs to a single country. It doesn\u2019t make sense to go the other direction, to look up a sale for each country, because this wouldn\u2019t be uniquely defined. The question doesn\u2019t make sense.<\/p>\n<p>Another way to look at this, is to check for each relationship whether it is one-to-many or many-to-one. I\u2019ll cover the special case of one-to-one relationships later in this series of articles.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"964\" height=\"265\" class=\"wp-image-81540\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-76.png\" \/><\/p>\n<p>Using the above relationship diagram, you can see that one country has many cities and one city has many sales, so it\u2019s legitimate to show the country where each sales transaction took place. The <code>RELATED<\/code> function is like the <code>VLOOKUP<\/code> function in Excel, with the difference being that you can daisy-chain between tables provided that they are joined by the correct type of relationship.<\/p>\n<p>Here, for example, is the formula to show the country for each sale:<\/p>\n<pre class=\"theme:vs2012-simple-talk lang:c# decode:true\">Country = RELATED(\r\n\r\n    \/\/ look up the country name from\r\n    \/\/ the country table (indirectly \r\n    \/\/ linked via the city table)\r\n    Country[CountryName]\r\n)\r\n<\/pre>\n<p>And here\u2019s what the final sales table might look like after adding the <em>Product<\/em> and <em>City<\/em> columns as well with the same technique:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"895\" height=\"427\" class=\"wp-image-81541\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-77.png\" \/><\/p>\n<p>Note that there\u2019s a problem with sales in Cape Town, which don\u2019t seem to belong to any country &#8211; I\u2019ll show how to resolve this further down in this article.<\/p>\n<h2>Linking Tables Together Using RELATEDTABLE<\/h2>\n<p>The previous calculated columns show the parent for any child; the <code>RELATEDTABLE<\/code> function allows you to show the children for any parent. The difference is that whereas the <code>RELATED<\/code> function will always return a single value, the <code>RELATEDTABLE<\/code> function will always return a table of data.<\/p>\n<p>Here\u2019s an example to show, for each country, how many sales took place within it:<\/p>\n<pre class=\"theme:vs2012-simple-talk lang:c# decode:true\">Number of sales = COUNTROWS(\r\n    \r\n    \/\/ count how many rows there are in the \r\n    \/\/ table of sales for this country\r\n    RELATEDTABLE(Sales)\r\n)\r\n<\/pre>\n<p>The formula would return these values in this example:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"347\" height=\"156\" class=\"wp-image-81542\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-78.png\" \/><\/p>\n<h2>Dealing with Blanks<\/h2>\n<p>If you\u2019re a SQL programmer, you\u2019ll be familiar with the perils of including null values in your formula since any formula which has a null value as one of its inputs tends to spit out a null value as its result. The equivalent to a null value in DAX is <em>BLANK<\/em>. You can test whether a column equals blank in one of two ways \u2013 either by comparing it directly with <em>BLANK<\/em> or by using the <code>ISBLANK<\/code> function.<\/p>\n<p>Review the example showing the country in which each sale took place. The results show a blank next to any sale in Cape Town:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"792\" height=\"299\" class=\"wp-image-81543\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-79.png\" \/><\/p>\n<p>Cape Town belongs to country number 6:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"325\" height=\"307\" class=\"wp-image-81544\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-80.png\" \/><\/p>\n<p>There is no country number 6 in the database which has caused the problem with Cape Town:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"347\" height=\"153\" class=\"wp-image-81545\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-81.png\" \/><\/p>\n<p>It\u2019s not very good database design practice, but it is a very convenient way to illustrate how blanks work. The problem is that the <code>RELATED<\/code> function looks across to the country table and returns the country in which each sale takes place. For sales in Cape Town, the function can\u2019t find a related country, and so just shows a blank.<\/p>\n<p>This looks suspicious, so change it to show the message \u201cCountry not found.\u201d One way to do this is by testing to see whether the value returned from the <code>RELATED<\/code> function is a blank:<\/p>\n<pre class=\"theme:vs2012-simple-talk lang:c# decode:true\">Country =  IF(\r\n\r\n        \/\/ if there's no country found for this sale ...\r\n        RELATED(Country[CountryName]) = BLANK(),\r\n        \r\n        \/\/ ... show blank, or otherwise ...\r\n        \"Country not found\",\r\n        \r\n        \/\/ ... show the country found\r\n        RELATED(Country[CountryName])\r\n    )\r\n<\/pre>\n<p>Alternatively, you could use the ISBLANK function:<\/p>\n<pre class=\"theme:vs2012-simple-talk lang:c# decode:true \">CountryISBLANK = IF(\r\n\r\n        \/\/ if there's no country found for this sale ...\r\n        ISBLANK(RELATED(Country[CountryName])),\r\n        \/\/ ... show blank, or otherwise ...\r\n        \"Country not found\",\r\n        \r\n        \/\/ ... show the country found\r\n        RELATED(Country[CountryName])\r\n    )\r\n<\/pre>\n<p>In either case, the results are much better!<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"921\" height=\"121\" class=\"wp-image-81546\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-82.png\" \/><\/p>\n<p>Perhaps an even better way to solve this problem is to create an intermediate column which you can then hide from view:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"606\" height=\"157\" class=\"wp-image-81547\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-83.png\" \/><\/p>\n<p>This will make the formula simpler since you won\u2019t need to repeat the RELATED function. The formula for the intermediate country column above could be:<\/p>\n<pre class=\"theme:vs2012-simple-talk lang:c# decode:true\">Intermediate country = \r\n        \r\n        \/\/ ... show the country found (may be blank)\r\n        RELATED(Country[CountryName])\r\n<\/pre>\n<p>The formula for the final country column could then refer to the intermediate column:<\/p>\n<pre class=\"lang:c# decode:true \">Country = IF(\r\n\r\n    \/\/ if the country returned is blank ...\r\n    ISBLANK([Intermediate country]),\r\n    \r\n    \/\/ ... show suitable message ...\r\n    \"Country not found\",\r\n    \r\n    \/\/ ... otherwise, show country name\r\n    [Intermediate country]\r\n)\r\n<\/pre>\n<h2>Dealing with Errors<\/h2>\n<p>The files referenced at the top of this article also include a separate table called <em>TempSales<\/em> which looks like this:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"542\" height=\"304\" class=\"wp-image-81548\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-84.png\" \/><\/p>\n<p>For some reason, Los Angeles has managed to record sales despite having no stores. How this happened is beyond the scope of this article (although it looks suspiciously like an attempt to create an example illustrating error-handling in DAX).<\/p>\n<p>If you create a calculated column showing sales-per-store within this table, you will get an error:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"638\" height=\"331\" class=\"wp-image-81549\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-85.png\" \/><\/p>\n<p>The reason there\u2019s an infinity sign next to Los Angeles is that this row contains a divide-by-zero error. If you divide total sales for Los Angeles of 4.5 by the number of stores, 0, you get infinity. There are several possible solutions to this: don\u2019t let the error happen in the first place, let it happen but trap it, or specifically watch out for divide-by-zero errors and handle them where they occur.<\/p>\n<p>You could use the <code>IF<\/code> function to test if the denominator is zero, therefore, preventing the error in the first place. For those who have forgotten their schooldays:<\/p>\n<ul>\n<li>The numerator is the number on the top of a fraction; and<\/li>\n<li>The denominator is the number at the bottom.<\/li>\n<\/ul>\n<p>In the equation 22\/7, 22 is the numerator and 7 the denominator.<\/p>\n<p>Using this method would give a calculated column like this:<\/p>\n<pre class=\"theme:vs2012-simple-talk lang:c# decode:true\">Sales per store = IF(\r\n\r\n    \/\/ if no stores, show blank\r\n    [NumberStores] = 0,\r\n    \r\n    BLANK(),\r\n    \r\n    \/\/ otherwise, do the division\r\n    [TotalSales] \/ [NumberStores]\r\n)\r\n<\/pre>\n<p>This method would give the following results (as would the other two methods used below):<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"284\" class=\"wp-image-81550\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-86.png\" \/><\/p>\n<p>The second method for solving this problem is to use the <code>IFERROR<\/code> function, which works in the same way as it does in Excel. The first argument is the thing which may contain an error, and the second one is what you want to display if it does:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"672\" height=\"94\" class=\"wp-image-81551\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-87.png\" \/><\/p>\n<p>For this case, you could enter the following DAX formula for the calculated column:<\/p>\n<pre class=\"theme:vs2012-simple-talk lang:c# decode:true\">Error-trapping = IFERROR(\r\n\r\n    \/\/ try dividing sales by number of stores\r\n    [TotalSales]\/[NumberStores],\r\n    \r\n    \/\/ if this fails, show blank\r\n    BLANK()\r\n)\r\n<\/pre>\n<p>The final method you could use is the <code>DIVIDE<\/code> function, which, as its name suggests, divides one number by another. The difference is that it automatically returns a value that you specify if this division returns a divide-by-zero error. The syntax of the function is:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"643\" height=\"71\" class=\"wp-image-81552\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-88.png\" \/><\/p>\n<p>In this case, you could use the following formula:<\/p>\n<pre class=\"theme:vs2012-simple-talk lang:c# decode:true\">Divide by zero trap = DIVIDE(\r\n\r\n    \/\/ divide the sales by the number of stores\r\n    [TotalSales],\r\n    [NumberStores]\r\n\r\n    \/\/ could specify what to return if denominator \r\n    \/\/ is zero, but this will default to blank,\r\n    \/\/ which is what we want\r\n)\r\n<\/pre>\n<p>Which of the three methods is best? I think a purist would answer the first one because it prevents the error from happening in the first place, but I\u2019d go with the last one. If you\u2019re going to trap for an error, it\u2019s good to be as specific as possible about the nature of the error you\u2019re trapping; <code>IFERROR<\/code> is a bit vague.<\/p>\n<h2>Finally: How DAX Stores Data<\/h2>\n<p>To end this article, I\u2019d like to expand a bit on how DAX stores data in columns, not rows. Understanding this will help you make sense of DAX formulae when they get more complicated (which they certainly will!).<\/p>\n<p>To explain column storage, I\u2019m going to take a quick digression into image compression. As my youngest daughter says: \u201cBear with!\u201d.<\/p>\n<p>Here\u2019s a picture of a house (and a very nice one it is too):<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"565\" height=\"356\" class=\"wp-image-81553\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-89.png\" \/><\/p>\n<p>Notice how the sky is the same shade of blue nearly everywhere. To store this image, you wouldn\u2019t have to store every pixel; instead, you could use an image compression algorithm to condense everything down, using the fact that lots of adjacent pixels share the same colour. An image which compresses down to a small size tends to be one which would be hard to do as a jigsaw!<\/p>\n<p>The above example is a bit complicated, so here\u2019s an easier one:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"425\" height=\"240\" class=\"wp-image-81554\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-90.png\" \/><\/p>\n<p>Not quite as nice a house, but much easier to use for an explanation! Here\u2019s how to store this as a compressed image. First, build a dictionary of colours:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"168\" height=\"118\" class=\"wp-image-81555\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-91.png\" \/><\/p>\n<p>Now (starting from the top left) create a table showing how the colours are used:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"252\" height=\"223\" class=\"wp-image-81556\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-92.png\" \/><\/p>\n<p>By my calculation, this table will have 34 rows. Instead of storing 100 different cell\/colour combinations, I\u2019m storing three colours and 34 rows, reducing the size of the image by a factor of three. Real image compression algorithms are much more sophisticated than this, but the principle is the same.<\/p>\n<p>What has all this got to do with column storage? Well, the principle is identical. Here\u2019s what you might think the sales table would look like when storing each row as a separate entity:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"413\" height=\"206\" class=\"wp-image-81557\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-93.png\" \/><\/p>\n<p>Here\u2019s what the table actually looks like to DAX. Each column is stored separately, although not necessarily using different colours!<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"403\" height=\"286\" class=\"wp-image-81558\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-94.png\" \/><\/p>\n<p>The engine which stores data for PowerPivot, Power BI and SSAS Tabular is called the <em>xVelocity<\/em> engine (at least, it has been called that since SQL Server 2012). This database engine would store each column above separately. To show how this works, consider the <strong>ProductId<\/strong> column, which starts like this:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"80\" height=\"623\" class=\"wp-image-81559\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-95.png\" \/><\/p>\n<p>The first thing the database engine would do is to build a dictionary of unique values:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"231\" height=\"97\" class=\"wp-image-81560\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-96.png\" \/><\/p>\n<p>There are only three rows because, although there are 25 sales, these are all for one or another of the same three products. The engine would then store for each product its dictionary entry.<\/p>\n<p>This storage method allows the xVelocity engine to compress data so that it takes up less memory. The storage method also allows DAX to access all the values in a column more quickly. It\u2019s one reason why aggregating data over a column is so quick in DAX: all the figures you\u2019re aggregating are stored in the same place. It\u2019s worth bearing all of this in mind when you\u2019re writing DAX formulae and reading the rest of this series.<\/p>\n<h2>A Footnote: Efficient Storage<\/h2>\n<p>One of the consequences of the above storage algorithm is that the database engine will store columns with low granularity (i.e., with lots of repeated values) much more efficiently. In the diagram below, I\u2019ve changed the colours to reflect how expensive each column is to store:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"403\" height=\"196\" class=\"wp-image-81561\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/11\/word-image-97.png\" \/><\/p>\n<p>The <em>SalesId<\/em> column is the most expensive item to store because it has no duplicate values at all. It\u2019s the primary key in the <em>Sales<\/em> table, so that\u2019s unique by definition. The best thing to do to the data model would be to delete this column since it\u2019s not used in any reports, and it\u2019s not needed for any of the relationships.<\/p>\n<p>Unfortunately, you probably do need all the other columns in this table. Product id and city id fields are needed to link tables together. The price and quantity are needed to create measures, and the sales date are used to report figures by month, quarter and year.<\/p>\n<h2>Conclusion<\/h2>\n<p>This article has shown that you can create formulae to calculate columns within a table in Power BI, PowerPivot or SSAS Tabular. Although these formulae use the DAX language, they look remarkably similar to formulae used within Excel. You\u2019ve also seen that DAX stores data in columns rather than rows, which has important implications. Every calculated column that you create will need to be stored separately in memory. The next article in this series will look at the flip-side to calculated columns \u2013 measures \u2013 and explain the two most important concepts in DAX, which are row context and filter context.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this article, Andy Brown shares his knowledge of how to create calculated columns in tables, using Power BI as the host software (all of the formulae shown would work equally well in PowerPivot or SSAS Tabular).  This article is designed to be the first in a series of articles which will show you how to master programming in DAX. &hellip;<\/p>\n","protected":false},"author":9783,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[143528],"tags":[],"coauthors":[6782],"class_list":["post-81459","post","type-post","status-publish","format-standard","hentry","category-bi-sql-server"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/81459","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/9783"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=81459"}],"version-history":[{"count":22,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/81459\/revisions"}],"predecessor-version":[{"id":82625,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/81459\/revisions\/82625"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=81459"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=81459"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=81459"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=81459"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}