{"id":68154,"date":"2016-09-16T13:33:29","date_gmt":"2016-09-16T13:33:29","guid":{"rendered":"https:\/\/www.simple-talk.com\/?p=68154"},"modified":"2021-09-15T13:22:23","modified_gmt":"2021-09-15T13:22:23","slug":"power-bi-data-modelling","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/bi-sql-server\/power-bi-data-modelling\/","title":{"rendered":"Power BI for Data Modelling"},"content":{"rendered":"<p>Very few SQL Server practitioners have managed to avoid hearing of Power BI. Some have even used it to create impressive dashboards bulging with charts and other visuals. Even after the initial excitement of the \u2018unboxing\u2019 has faded, many of us have (occasionally grudgingly) admitted that it is a powerful product that certainly seems to deliver what it promises.<\/p>\n<p>There is, however, a use for Power BI that is far removed from the graphical bells and whistles, but which could prove to be of most value to the database professional. This is the use of Power BI to model and shape data.<\/p>\n<h2>Why Shape Data?<\/h2>\n<p>The classic self-service BI approach usually follows the approach that is outlined in Figure 1, below:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2016\/09\/word-image-377.png\" width=\"562\" height=\"91\" \/><\/p>\n<p class=\"caption\">Figure 1.<\/p>\n<p>This is the solution that most users currently implement. You find data sources, connect to them, import what you need and then add any required calculations before creating dashboards. This is exactly what the product was designed for and it usually works flawlessly with an impressive range of data sources. Equally impressive is the power of DAX in its latest incarnation to extend a data model with complex additional metrics. Finally the data modelling capabilities allow you to add hierarchies and define KPIs.<\/p>\n<p>However this approach assumes that the source data is comprehensible to users, and easy to apply. This is an optimistic assumption that can end in tears. What happens when the data is sourced from a complex relational system? This can result in lay users feeling helpless and lost in an unfriendly thicket of metadata. Even if Power BI can detect relationships between tables, and lets you, the developer, rename tables and columns to make the result seem less intimidating, an OLTP system is rarely designed to optimize end-user happiness.<\/p>\n<p>Things can get even worse if your data source is an ERP system. Whatever the source, and even if the table names are in English \u2013 the metadata is largely incomprehensible and any traditional table relationships are probably absent.<\/p>\n<p>Even when the database is relatively straightforward, you could have a valid reason to want to present data differently. Perhaps you want to try out a data warehouse topology or apply a rapid application development process to a Kimball model before you launch into the creation of a full-blooded enterprise data warehouse. Maybe you want to experiment with a Data Vault approach. Whatever the reason, adapting or refining the data landscape could be either an option or a necessity.<\/p>\n<p>It follows that if you are going to attempt a self-service BI solution you are probably going to have to think in terms of adding a metadata layer to the source data to render the information accessible to users. This is the approach that is outlined in Figure 2:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2016\/09\/word-image-378.png\" width=\"639\" height=\"81\" \/><\/p>\n<p class=\"caption\">Figure 2.<\/p>\n<p>As any BI practitioner will immediately note, there is nothing original about this approach. It is probably what most BI architects and developers have been doing with the SQL Server stack for 15 years or more. So why not apply it to a Power BI solution?<\/p>\n<h2>Full-Stack BI with Power BI<\/h2>\n<p>It may come as a surprise to some, but Power BI can \u2013 with a little application \u2013 become a full-stack BI solution. This is because even though it is a single tool, it lets you carry out the key steps in a BI process. The table below lets you compare Power BI with the SQL Server BI stack:<\/p>\n<table>\n<tbody>\n<tr>\n<td>\n<p><strong>Segment<\/strong><\/p>\n<\/td>\n<td>\n<p><strong>SQL Server Stack<\/strong><\/p>\n<\/td>\n<td>\n<p><strong>Power BI<\/strong><\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p><strong>Data Load <\/strong><\/p>\n<\/td>\n<td>\n<p>SSIS<\/p>\n<\/td>\n<td>\n<p>Power BI Query Editor<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p><strong>Data Modelling<\/strong><\/p>\n<\/td>\n<td>\n<p>SSAS (SSDT)<\/p>\n<\/td>\n<td>\n<p>Power BI Query Editor \/ Power BI<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p><strong>Metrics<\/strong><\/p>\n<\/td>\n<td>\n<p>SSAS (SSDT)<\/p>\n<\/td>\n<td>\n<p>Power BI \/ DAX<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p><strong>Reporting<\/strong><\/p>\n<\/td>\n<td>\n<p>SSRS (Paginated and Mobile)<\/p>\n<\/td>\n<td>\n<p>Power BI<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p><strong>Distribution<\/strong><\/p>\n<\/td>\n<td>\n<p>SharePoint<\/p>\n<\/td>\n<td>\n<p>PowerBI.Com<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Simply put, Power BI will let you do nearly everything that you can do using the traditional SQL Server BI toolkit.<\/p>\n<p>The next question to ask is why you should want to use a simple, free tool to replace an industrial-strength tried and tested approach. The answer is, of course, much more nuanced. You might not want to replace an existing system. Yet you could want to:<\/p>\n<ul>\n<li>Develop a proof of concept data warehouse in a shorter time-frame<\/li>\n<li>Deliver a specific and tailored solution to a subset of users<\/li>\n<li>Avoid extending a corporate data warehouse to empower only a small group of users where the complexity is disproportionate<\/li>\n<li>Deliver a targeted solution for a specific group of users where a corporate solution is not cost-effective<\/li>\n<li>Allow users to access data from outside the specific corporate data that they use traditionally \u2013 specifically Big Data sources<\/li>\n<li>Model and test new data sources \u2013 and mix OLAP, OLTP and Big Data sources<\/li>\n<\/ul>\n<p>This list could go on. However I hope that some of these ideas will strike a chord with readers.<\/p>\n<p>The breadth of the capabilities that Power BI offers allows you to perform all of the following everyday BI tasks:<\/p>\n<p><strong>Data Load<\/strong><\/p>\n<ul>\n<li>Data Profiling<\/li>\n<li>Load Sequencing<\/li>\n<li>Data Type Conversion<\/li>\n<li>Data Lookups<\/li>\n<li>Relational to dimensional conversion<\/li>\n<\/ul>\n<p><strong>Data Modelling &amp; Metrics<\/strong><\/p>\n<ul>\n<li>Schema design:\u2013 Dimensional vs. tabular<\/li>\n<li>Semantic layer (rename objects for greater clarity)<\/li>\n<li>Hierarchies<\/li>\n<li>KPIs<\/li>\n<li>Calculated metrics<\/li>\n<\/ul>\n<p><strong>Presentation Layer<\/strong><\/p>\n<ul>\n<li>Compare visualization types \/ test chart types<\/li>\n<li>Mock-up reports \/ dashboards (element assembly)<\/li>\n<li>Test Filters (&amp; Slicers)<\/li>\n<li>Try out Hierarchical Drilldown<\/li>\n<li>Define user interaction requirements (Self-Service vs. Enterprise BI)<\/li>\n<\/ul>\n<h2>Using Power BI Desktop to Apply a Logical Dimensional Layer<\/h2>\n<p>Although I haven\u2019t the space to go into all the detail and to examine all the possible variations on the theme of BI, all the tasks listed above are perfectly possible using Power BI. For the moment we will concentrate on a few core elements of the stack to apply a logical dimensional structure to a relational data source, adapt the semantic layer and then add a couple of calculated metrics and a few hierarchies. This will allow you to appreciate some of the more well-known (as well as one or two of the less well-known) aspects of the Power BI Query Editor in Power BI Desktop. What we will see includes:<\/p>\n<ul>\n<li>Joining tables at query level (as opposed to doing this in the Power BI Relationships view)<\/li>\n<li>Using hidden intermediate queries as a data staging area<\/li>\n<li>Generating and applying surrogate keys in the Power BI Query Editor<\/li>\n<li>Renaming queries and fields<\/li>\n<li>Adding hierarchies<\/li>\n<li>Creating calculated measures<\/li>\n<\/ul>\n<p>This article assumes that you already have some knowledge of Power BI Desktop and the Power BI Query Editor, but what follows should not, I hope, deter even beginners. After all, this tool is ostensibly designed for ease of use, and so even moderately complex modifications should be comprehensible by neophytes.<\/p>\n<h3>The Relational Schema<\/h3>\n<p>The sample data for this article is an OLTP database (albeit a very simple one) that contains six tables. These are illustrated in the following figure:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2016\/09\/word-image-379.png\" \/><\/p>\n<p class=\"caption\">Figure 3<\/p>\n<h3>The Desired Dimensional Schema<\/h3>\n<p>The relational data has, inevitably, been optimized for transactional processing. Consequently the fields that users will need for analytics are distributed among several tables. What most users want is a simplified presentation layer that delivers attributes and metrics in classic star schema. In fact we could imagine that the CIO of the company just returned from a power lunch with a suite of highly-paid external consultants bearing a napkin containing the ideal BI architecture. This is shown in Figure 4:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2016\/09\/word-image-380.png\" width=\"620\" height=\"432\" \/><\/p>\n<p class=\"caption\">Figure 4<\/p>\n<p>Clearly this design is not a fully-fledged enterprise data architecture. However it is enough for the purposes of this article, as it will let us overlay the existing relational design with a radically different data topology and then extend it with a time dimension.<\/p>\n<h2>Loading the Data<\/h2>\n<p>As a first step in creating a star schema over a relational model we need to load the data. This example will use the sample database <strong>CarSalesData<\/strong>. If you prefer to save yourself the trouble of creating a database and running the script to populate the tables, the data for these tables is also in a spreadsheet named <strong>CarSales_Tables<\/strong> that is supplied with this article. Power BI could, of course have sourced the data from any of the multitude of sources that it can handle.<\/p>\n<ol>\n<li>Launch Power BI Desktop and click \u2018<em>Get Data\u2019<\/em> from the splash screen.<\/li>\n<li>Click \u2018<em>Database\u2019<\/em>, \u2018<em>SQL<\/em> <em>Server<\/em> <em>Database\u2019<\/em> and \u2018<em>Connect<\/em>.\u2019 The SQL Server Database dialog will appear.<\/li>\n<li>Enter the server name, click \u2018<em>Import\u2019<\/em> and \u2018<em>OK\u2019<\/em>. The Navigator dialog will appear.<\/li>\n<li>Expand the <strong>CarSalesData<\/strong> database and select the six tables that are in the ERD diagram in Figure 3. The Navigator dialog will look something like the one in Figure 5.\n<p><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2016\/09\/word-image-381.png\" \/><\/p>\n<p class=\"caption\">Figure 5.<\/p>\n<\/li>\n<li>Click \u2018<em>Edit\u2019<\/em>. The Power BI Query Editor will open and display the tables that you have selected. You can see this in Figure 6.\n<p><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2016\/09\/word-image-382.png\" \/><\/p>\n<p class=\"caption\">Figure 6<\/p>\n<\/li>\n<li>In the Queries pane on the left of the screen, right-click on each source data table individually and uncheck \u2018<em>Enable<\/em> <em>Load\u2019<\/em>. This will make the source tables into \u201cintermediate\u201d or staging tables that will not be visible to end users but that can nonetheless be used as a basis for the data transformations that will be applied later. The table names will appear in italics in the Queries pane on the left.<\/li>\n<li>Rename all the queries to remove the \u201cData\u201d prefix.<\/li>\n<\/ol>\n<p>The first and fairly painless stage is finished. You now have the relational data in Power BI Desktop ready for dimensional modelling.<\/p>\n<h2>Creating the Vehicle Dimension<\/h2>\n<p>A quick look at the source data shows that the attributes describing vehicles can be found in a couple of tables \u2013 <strong>Stock<\/strong> and <strong>Colors<\/strong>. So we need to isolate the required attributes from these tables and create a single \u201cvirtual table\u201d (which is really another query) that will be visible to the user as the <strong>Vehicle<\/strong> dimension.<\/p>\n<ol>\n<li>Right-click on the <strong>Stock<\/strong> query and select \u2018<em>Reference\u2019<\/em>. This will create a copy of the <strong>Stock<\/strong> query that will depend for its source data on the source query.<\/li>\n<li>In the Query Settings pane on the right rename the Stock (2) query <strong>Dim_Vehicle<\/strong>.<\/li>\n<li>Leaving the <strong>Dim_Vehicle<\/strong> query selected, click \u2018<em>Merge<\/em> <em>Queries\u2019<\/em>. When the \u2018<em>Merge\u2019<\/em> dialog appears click on the <strong>ColorID<\/strong> column in the upper part of the dialog.<\/li>\n<li>Select the <strong>Colors<\/strong> query from the popup list and click on the <strong>ColorID<\/strong> column in the lower part of the dialog.<\/li>\n<li>Define the Join Kind as Inner. You will see something like Figure 7.\n<p><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2016\/09\/word-image-383.png\" \/><\/p>\n<p class=\"caption\">Figure 7<\/p>\n<\/li>\n<li>Click <em>\u2018OK\u2019<\/em>. A new column (named <strong>NewColumn<\/strong>) will appear at the right of the data table.<\/li>\n<li>Click the icon at the right of the new column. In the popup select the \u2018<em>Expand\u2019<\/em> radio button, uncheck \u2018<em>ColorID\u2019<\/em> and also uncheck \u2018<em>Use original column name as prefix\u2019<\/em>. The dialog will look like the following figure:\n<p><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2016\/09\/word-image-384.png\" \/><\/p>\n<p class=\"caption\">Figure 8<\/p>\n<\/li>\n<li>Click <em>\u2018OK\u2019<\/em>. The Color column from the <strong>Colors<\/strong> query will be added to the <strong>Dim_Vehicle<\/strong> query. You have, in effect, created a \u201cview\u201d based on the two queries.<\/li>\n<li>Control-click to select the following columns: <strong>Make<\/strong>, <strong>Model<\/strong>, <strong>VehicleType<\/strong> and <strong>Color<\/strong>. Then right-click on any of the selected columns and choose \u2018<em>Remove other columns\u2019<\/em>. This will leave you with a data table containing four columns only. These columns are the attributes required by the <strong>Vehicle<\/strong> dimension.<\/li>\n<li>Select all four columns in the table and then, in the Home ribbon click Remove Rows and then Remove Duplicates. Only unique records will remain in the table.<\/li>\n<li>In the \u2018<em>Add<\/em> <em>Column\u2019<\/em> ribbon, click the popup triangle in the \u2018<em>Add index Column\u2019<\/em> button. Select From 1. A new column containing a unique monotonically increasing identifier will be added. This will be the surrogate key.<\/li>\n<li>Right-click on the new column and rename it <strong>VehicleSK<\/strong>. The final dimension query will look like the one in the following figure:\n<p><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2016\/09\/word-image-385.png\" \/><\/p>\n<p class=\"caption\">Figure 9<\/p>\n<\/li>\n<\/ol>\n<h2>Creating the Sales Fact Table<\/h2>\n<p>Let us now jump straight to the creation of the fact table that shows all the car sales from the source database. Here is how to set this up:<\/p>\n<ol>\n<li>Right-click on the <strong>InvoiceLines<\/strong> query and select \u2018<em>Reference\u2019<\/em>. This will create a copy of the source query. The newly-created query will use the original query as a data source.<\/li>\n<li>Right click on the reference table that you just created and select \u2018<em>Rename\u2019<\/em>. Call it Fact_Sales.<\/li>\n<li>Leaving the <strong>Fact_Sales<\/strong> query selected, in the Home ribbon click \u2018<em>Merge<\/em> <em>Queries\u2019<\/em>. When the \u2018<em>Merge\u2019<\/em> dialog appears click on the <strong>StockID<\/strong> column in the upper part of the dialog.<\/li>\n<li>Select the <strong>Stock<\/strong> query from the popup list and click on the <strong>StockID<\/strong> column in the lower part of the dialog. Set the Join Kind as Inner. Click <em>\u2018OK\u2019<\/em>.<\/li>\n<li>Click the icon at the right of the new column. In the popup select the \u2018<em>Expand\u2019<\/em> radio button, uncheck \u2018<em>Select<\/em> <em>all<\/em> <em>columns\u2019<\/em> and also uncheck \u2018<em>Use original column name as prefix<\/em>\u2019. Select the <strong>Make<\/strong>, <strong>Model<\/strong>, <strong>VehicleType<\/strong> and <strong>ColorID<\/strong> columns, then click <em>\u2018OK\u2019<\/em>.<\/li>\n<li>Click \u2018<em>Merge<\/em> <em>Queries\u2019<\/em>. When the Merge dialog appears click on the <strong>NewColumn.ColorID<\/strong> column in the upper part of the dialog.<\/li>\n<li>Select the <strong>Colors<\/strong> query from the popup list and click on the <strong>ColorID<\/strong> column in the lower part of the dialog. Set the Join Kind as Inner. Click <em>\u2018OK\u2019<\/em>.<\/li>\n<li>Click the icon at the right of the new column. In the popup select the \u2018<em>Expand\u2019<\/em> radio button, uncheck \u2018<strong>ColorID\u2019<\/strong> and also uncheck \u2018<em>Use original column name as prefix<\/em>\u2019. Select the <strong>Make<\/strong>, <strong>Model<\/strong>, <strong>VehicleType<\/strong> and <strong>ColorID<\/strong> columns, then click <em>\u2018OK\u2019<\/em>.<\/li>\n<li>Click \u2018<em>Merge<\/em> <em>Queries\u2019<\/em>. When the \u2018<em>Merge\u2019<\/em> dialog appears click on the <strong>NewColumn.Make<\/strong>, <strong>NewColumn.Model<\/strong>, <strong>NewColumn.VehicleType<\/strong> and <strong>NewColumn.Color<\/strong> columns in this order.<\/li>\n<li>Select the <strong>Dim_Vehicle<\/strong> query from the popup list of available queries.<\/li>\n<li>Click on the <strong>Make<\/strong>, <strong>Model<\/strong>, <strong>VehicleType<\/strong> and <strong>Color<\/strong> columns in this order in the lower part of the dialog, then click <em>\u2018OK\u2019<\/em>. You are joining the two queries using multiple fields.<\/li>\n<li>Click the icon at the right of the new column. In the popup select the \u2018<em>Expand\u2019<\/em> radio button, uncheck \u2018<em>Select all columns\u2019<\/em> and select only the <strong>VehicleSK<\/strong> column. Click <em>\u2018OK\u2019<\/em>.<\/li>\n<li>Select the <strong>VehiclePrice<\/strong> and <strong>VehicleSK<\/strong> in the <strong>Fact<\/strong>_<strong>Sales<\/strong> table. Right-click on any of these columns and select \u2018<em>Remove<\/em> <em>other<\/em> <em>columns\u2019<\/em>. The fact table will look like it does in the following Figure:\n<p><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2016\/09\/word-image-386.png\" \/><\/p>\n<p class=\"caption\">Figure 10<\/p>\n<\/li>\n<\/ol>\n<p>This process let you detect and add the <strong>VehicleSK<\/strong> field from the vehicle dimension to the fact table. I imagine that most BI practitioners have carried out these kinds of operations using SSIS and T-SQL many times in their careers.<\/p>\n<h2>Adding a Time Dimension<\/h2>\n<p>It is hard to imagine a data warehouse \u2013 even a tiny model like the one that you are seeing here \u2013 without a time dimension. So let\u2019s see how to add this to the model in a couple of minutes.<\/p>\n<ol>\n<li>In Data View, activate the \u2018<em>Modeling\u2019<\/em> ribbon and click the \u2018<em>New<\/em> <em>Table\u2019<\/em> button. The expression Table = will appear in the Formula Bar.<\/li>\n<li>Replace the word \u2018Table\u2019 with \u2018DateDimension\u2019.<\/li>\n<li>Click to the right of the equals sign and enter the following DAX function\n<pre>DateDimension = CALENDAR( \"1\/1\/2012\", \"31\/12\/2016\" )<\/pre>\n<\/li>\n<li>Press Enter or click the tick icon in the Formula Bar. Power BI Desktop will create a table containing a single column of dates from the 1st of January 2012 until the 31st of December 2016.<\/li>\n<li>In the Fields list, right-click on the <strong>Date<\/strong> field in the <strong>DateDimension<\/strong> table and select \u2018<em>Rename\u2019<\/em>. Rename the <strong>Date<\/strong> field to <strong>DateSK<\/strong>.<\/li>\n<li>Add five new columns containing the formulas shown in the table below:<br \/>\n<table>\n<tbody>\n<tr>\n<td>\n<p>Column Title<\/p>\n<\/td>\n<td>\n<p>Formula<\/p>\n<\/td>\n<td>\n<p>Comments<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p>FullYear<\/p>\n<\/td>\n<td>\n<p>YEAR([DateSK])<\/p>\n<\/td>\n<td>\n<p>Isolates the year as a four digit number<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p>Quarter<\/p>\n<\/td>\n<td>\n<p>&#8220;Q&#8221; &amp;ROUNDDOWN(MONTH([DateSK])\/4,0)+1<\/p>\n<\/td>\n<td>\n<p>Displays the current quarter in short form<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p>QuarterNumber<\/p>\n<\/td>\n<td>\n<p>ROUNDDOWN(MONTH([DateSK])\/4,0)+1<\/p>\n<\/td>\n<td>\n<p>Displays the number of the current quarter. This is essentially used as a sort by column<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p>MonthFull<\/p>\n<\/td>\n<td>\n<p>FORMAT([DateSK], &#8220;MMMM&#8221;)<\/p>\n<\/td>\n<td>\n<p>Displays the full name of the month<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p>MonthNumber<\/p>\n<\/td>\n<td>\n<p>MONTH([DateSK])<\/p>\n<\/td>\n<td>\n<p>Isolates the number of the month in the year as one or two digits<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/li>\n<li>Select the <strong>Quarter<\/strong> column. In the Modeling ribbon click the popup triangle in the \u2018<em>Sort<\/em> <em>By<\/em> <em>Column\u2019<\/em> button and select <strong>QuarterNumber<\/strong>.<\/li>\n<li>Select the <strong>FullMonth<\/strong> column. In the \u2018<em>Modeling\u2019<\/em> ribbon click the popup triangle in the \u2018<em>Sort<\/em> <em>By<\/em> <em>Column\u2019<\/em> button and select <strong>MonthNumber<\/strong>\n<p>In case this short sequence seems a little cabbalistic, let me explain what you have done:<\/p>\n<ul>\n<li>Using the DAX formula <strong>CALENDAR()<\/strong> you specified a range of dates for Power BI to generate a table containing a continuous date range.<\/li>\n<li>You added fields to display quarter and month \u2013 as well as the numbers for these items that are used as sort indicators.<\/li>\n<li>Finally you applied the sort order to any non-numeric columns. This prevents the month names appearing in alphabetical order.<\/li>\n<\/ul>\n<\/li>\n<\/ol>\n<p>Your Date dimension is now complete. It is, admittedly, an extremely shortened version of the kind of fully-fledged table that you would need in a large-scale application. However it is enough to make the design point that underlies this article.<\/p>\n<h2>Finishing the Data Model<\/h2>\n<p>Now let\u2019s create the dimensional model \u2013 albeit with only one dimension for the moment, as I am keener on clarifying the principle that you can then employ for most other dimensions rather than wading through all the detail.<\/p>\n<ol>\n<li>Assuming that you are still in the Power BI Desktop Query Editor, click \u2018<em>Close\u2019<\/em> and \u2018<em>Apply\u2019<\/em> to return to the Report view.<\/li>\n<li>Click on the Relationships icon on the left (the third icon down). You should see the fact and dimension table joined on the <strong>VehicleSK<\/strong> field. The initial model will look like it does in the following Figure:\n<p><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2016\/09\/word-image-387.png\" \/><\/p>\n<p class=\"caption\">Figure 11<\/p>\n<\/li>\n<\/ol>\n<p>Yes, that is right, you have nothing more to do. Power BI Desktop has guessed that the queries (that are now, for Power BI, tables) are designed to be joined on a specific field and has helpfully added the relationship between the tables.<\/p>\n<p>The remaining two dimensions can be created using exactly the same techniques that you saw in the section describing how to create the vehicle dimension. So I will not describe all the steps laboriously, but prefer to refer you to the finished model that you can find in the <strong>PowerBiForDataModelling.Pbix<\/strong> file that accompanies this article.<\/p>\n<p>The finished data model that you can view in the Power BI Desktop Relationships view looks like the one in Figure 12, below. As you can see, we have created a practical implementation based on the high-level (that is, napkin-based) starting point that you saw above in Figure 4:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2016\/09\/word-image-388.png\" \/><\/p>\n<p class=\"caption\">Figure 12<\/p>\n<p>Power BI Desktop might not, in all cases guess the relationships correctly in a complex model. However adding a new relationship is as easy as dragging the surrogate key field from the fact table to the dimension (or vice-versa). Equally easily you can delete any erroneous relationships by right clicking on the relationship and selecting Delete.<\/p>\n<h2>Finalizing the Data Model<\/h2>\n<p>The data model still needs a few tweaks to make it truly user-friendly and ready for self-service BI. A couple of things to do are:<\/p>\n<ul>\n<li>Hide the surrogate keys and sort by columns.<\/li>\n<li>Add a vehicle hierarchy and a time hierarchy.<\/li>\n<li>Create a few calculated measures as simple examples of what can be done to extend a data model.<\/li>\n<\/ul>\n<h3>Hide the surrogate keys<\/h3>\n<p>This is extremely simple, but nonetheless necessary. All you have to do, in any of the Power BI views in Power BI Desktop is to right-click on the field that you want to mask and select \u2018<em>Hide\u2019<\/em> in Report View. The surrogate key field will no longer be visible to users. They will, however appear in italics in \u2018<em>Relationship<\/em> <em>and<\/em> <em>data\u2019<\/em> view. You have to do this not only for all the surrogate keys in all the tables, but also for the <strong>QuarterNumber<\/strong> and <strong>MonthNumber<\/strong> fields in the <strong>DateDimension<\/strong> table.<\/p>\n<h3>Add Hierarchies<\/h3>\n<p>Hierarchies are a traditional metadata structure in BI, and Power BI Desktop now (from the March 2016 update) allows you to create them.<\/p>\n<ol>\n<li>In \u2018Report\u2019 view, ensure that the \u2018Fields\u2019 list is visible on the right.<\/li>\n<li>Expand the <strong>Dim_Vehicle<\/strong> table.<\/li>\n<li>Right-click on the <strong>Make<\/strong> field and select \u2018<em>New<\/em> <em>Hierarchy\u2019<\/em>.<\/li>\n<li>Right-click on the <strong>Model<\/strong> field and select \u2018<em>Add<\/em> <em>to<\/em> <em>Make<\/em> <em>hierarchy\u2019<\/em>.<\/li>\n<li>Right-click on the original <strong>Make<\/strong> and <strong>Model<\/strong> fields (outside the hierarchy that you just created) and select \u2018<em>Hide\u2019<\/em>. This way these fields will only be visible in the hierarchy.<\/li>\n<\/ol>\n<p>Now you have a ready to use parent-child hierarchy that can be applied to tables, matrices and charts.<\/p>\n<p>Creating a time hierarchy is a virtually identical process. You start with the <strong>FullYear<\/strong> field as the basis for a new hierarchy in the <strong>DataDimension<\/strong> table and then add the <strong>Quarter<\/strong> and <strong>Month Full<\/strong> fields. Finally you hide these latter two fields outside the hierarchy.<\/p>\n<h3>Create calculated measures<\/h3>\n<p>BI inevitably involves adding further analytical calculations to the data model. As a full demonstration of DAX (the language that Power BI uses for calculating metrics) is beyond the confines of this article, I merely want to demonstrate the principle here.<\/p>\n<ol>\n<li>In \u2018<em>Data\u2019<\/em> view, ensure that the \u2018<em>Fields\u2019<\/em> list is visible on the right.<\/li>\n<li>Select the <strong>Fact_Sales <\/strong>table in the \u2018<em>Fields\u2019<\/em> list.<\/li>\n<li>In the Home ribbon, click the popup triangle in the \u2018<em>New Measure\u2019<\/em> button and select \u2018<em>New Column\u2019<\/em>. A column, enticingly named \u201cColumn\u201d will appear in the <strong>Fact_Sales<\/strong> table at the right of any existing columns.<\/li>\n<li>Click inside the formula bar above the table and enter the following formula:\n<pre>TotalCosts = [CostPrice]-[SpareParts]-[LaborCost]<\/pre>\n<\/li>\n<li>Press Enter \u2013 or click the tick icon in the formula bar.<\/li>\n<\/ol>\n<p>The new column will calculate the total cost for every record in the fact table.<\/p>\n<p>As a final flourish I want to add some Time Intelligence to the model. So<\/p>\n<ol>\n<li>In \u2018<em>Data\u2019<\/em> view, ensure that the \u2018<em>Fields\u2019<\/em> list is visible on the right.<\/li>\n<li>Select the <strong>Fact_Sales <\/strong>table in the \u2018<em>Fields\u2019<\/em> list.<\/li>\n<li>In the \u2018<em>Home\u2019<\/em> ribbon, click the popup triangle in the \u2018<em>New<\/em> <em>Measure\u2019<\/em> button and select \u2018<em>New<\/em> <em>Measure\u2019<\/em>.<\/li>\n<li>Enter the following Measure\n<pre>QuarterSales = TOTALQTD(SUM(Fact_Sales[SalePrice]),DateDimension[DateSK])<\/pre>\n<\/li>\n<li>In the \u2018<em>Home\u2019<\/em> ribbon, click the popup triangle in the \u2018<em>New<\/em> <em>Measure\u2019<\/em> button and select \u2018<em>New<\/em> <em>Measure\u2019<\/em>.<\/li>\n<li>Enter the following Measure\n<pre>YearSales = TOTALYTD(SUM(Fact_Sales[SalePrice]),DateDimension[DateSK])<\/pre>\n<\/li>\n<\/ol>\n<p>You now have a couple of MDX measures that will calculate month to date and year to date sales.<\/p>\n<p>Now that the data model is finished, you should see all the measures and attributes that are displayed in the following figure \u2013 which is the Field List from the Report View in Power BI Desktop:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2016\/09\/word-image-389.png\" \/><\/p>\n<p class=\"caption\">Figure 12<\/p>\n<p>You can now save and close the Power BI Desktop Query Editor and begin using the available attributes and measures to create reports and dashboards. The sample data enabled me to create the dashboard that you can see in the following figure in approximately 3 minutes.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2016\/09\/word-image-390.png\" \/><\/p>\n<p class=\"caption\">Figure 14<\/p>\n<ol>\n<li>There are a few points that I must make now that the star schema model has been created over the initial relational model:<\/li>\n<li>You might not need to create reference tables in reality \u2013 a finalized star schema might adapt certain tables rather than initially setting them to be reference tables. However this approach can be a valid phase when initially designing a radically different data topology over a relational model.<\/li>\n<li>Occasionally (as is the case with the <strong>ClientID<\/strong> field in this data model) you can use an ID as a surrogate key. Just remember to do this quietly and not to mention the fact to any Kimball purists or you will never hear the end of it.<\/li>\n<li>The tough part can be flattening the fact table so that it initially contains all the required fields that are necessary to map the fact table to each dimension table and deduce the appropriate surrogate key. However this becomes easier with practice, and is certainly easier if you know the source data model well.<\/li>\n<li>You may find that you have to use the \u201cM\u201d language in Power BI Desktop Query Editor to perform certain calculations if these are tightly bound to the relational data structure. An example can be discretizing data \u2013 in other words reducing mileage to \u201cbuckets\u201d of mileage ranges, for instance. While you cannot hope to attain the ease and power of DAX in this environment, you can certainly carry out most of the kinds of calculations that you could perform using T-SQL when preparing a dimensional structure in SQL Server ready to be loaded into \u201cClassic\u201d SSAS.<\/li>\n<\/ol>\n<h2>The Limits to Power BI Data Modelling<\/h2>\n<p>The small example of in-process data modelling is, inevitably, an extremely simple one. Yet it hopefully suffices to make the point not only that this can be done, but that it can be performed relatively easily.<\/p>\n<p>However I do not want the extreme simplicity of the approach to mask the sweep of data transformations that are available when modelling data with Power BI Desktop Query Editor. Joining tables and aggregating datasets are only a start. To give you a more comprehensive overview of what can be done, take a look at the following table that compares the data transformation capabilities of both SSIS and the Power BI Desktop Query Editor.<\/p>\n<table>\n<tbody>\n<tr>\n<td>\n<p><strong>Data Transformation <\/strong><\/p>\n<\/td>\n<td>\n<p><strong>SSIS<\/strong><\/p>\n<\/td>\n<td>\n<p><strong>Power Query<\/strong><\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p>Pivot<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p>Unpivot<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p>Case Transformation<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p>Merge datasets<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p>Join datasets<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p>Deduplicate datasets<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p>Lookup<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p>Group and Aggregate<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p>Select Columns<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p>Exclude Rows<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p>Sort<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p>Split Columns<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p>Deduplicate<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p>Subset datasets<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p>Concatenate<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p>Split fields<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p>Filter Dataset<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p>Derived column<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p>Fuzzy Matching<\/p>\n<\/td>\n<td>\n<p>YES<\/p>\n<\/td>\n<td>\n<p>NO<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>These are, inevitably, the good points. When it comes to heavy-duty processing, SSIS clearly comes out on top, because Power BI offers:<\/p>\n<ul>\n<li>No Logging<\/li>\n<li>No monitoring<\/li>\n<li>No complex error handling<\/li>\n<li>No Data Profiling<\/li>\n<li>Little flow logic<\/li>\n<li>No parallel processing<\/li>\n<li>Large dataset limitations<\/li>\n<\/ul>\n<p>It follows that enterprise BI will remain the domain of the SQL Server toolkit for some time to come, and this is not in dispute. However I hope that, after a short time spent testing the data modelling capabilities of Power BI, even the most seasoned SQL Server professional will not be tempted to dismiss it out of hand as a lightweight tool for power users only.<\/p>\n<h2>Power BI Data Models for In-Memory Analytics<\/h2>\n<p>Earlier in this article I listed a few potential uses for Power BI in the enterprise. These included testing data integration approaches and trialing BI approaches. I want to finish with one more idea \u2013 and one that could prove to be a justification (as if any were needed) for rolling out SQL Server 2016 for in-memory analytics.<\/p>\n<p>In-memory tables have reached a degree of maturity in the 2016 release of SQL Server. From a BI perspective they have made a quantum leap, with the addition of clustered columnstore indexes. These deliver OLTP speed allied the kind of with extreme compression ratios that, in their turn, allow for data warehouse-level quantities of data to be contained in memory \u2013 and so accessible nearly always instantaneously when using Power BI Desktop in Direct Query mode.<\/p>\n<p>Power BI on top of these structures adds the final piece of the puzzle \u2013 it provides an easy to use and powerful user interface for analytics and reporting using the source data.<\/p>\n<p>The only cloud in this otherwise clear blue sky is that <strong>some<\/strong> of the transformations that are necessary to overlay a different logical layer do <strong>not<\/strong> (yet) work in direct query mode. This means that it is not currently possible to create a scenario where data is <strong>not<\/strong> downloaded into Power BI Desktop, but read directly from SQL Server memory while presenting a separate logical data layer to the user. The specific function that blocks the kind of transformation that we used above is the generation of an index column. This is, unfortunately not possible when using a direct connection. However, given the continuing evolution of Power BI we can live in hope that this possibility \u2013 or a suitable workaround \u2013 may appear soon.<\/p>\n<p>The result nonetheless is near-real-time analytics from transactional data presented in a comprehensible way to users. Self-service BI may just have made a great leap forward.<\/p>\n<div class=\"note\">\n<p><strong>Pro Power BI Desktop<\/strong><br \/>\n If you like what you read in this article and want to learn more about using Power BI Desktop, then please take a look at my book, <a href=\"http:\/\/www.amazon.com\/Pro-Power-Desktop-Adam-Aspin\/dp\/1484218043\">Pro Power BI Desktop<\/a> (Apress, May 2016).<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Power BI has a lot of uses, but  one of the most interesting of them is to model and shape data to make it easier for self-service BI.  Power BI is a full-stack solution that includes data loading, data modelling, metrics, reporting and distribution.  It can take the source data, and perform  in-process data modelling relatively easily as well a providing an easy to use and powerful user interface for analytics and reporting.&hellip;<\/p>\n","protected":false},"author":2181,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[143528],"tags":[],"coauthors":[12351],"class_list":["post-68154","post","type-post","status-publish","format-standard","hentry","category-bi-sql-server"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/68154","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/2181"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=68154"}],"version-history":[{"count":6,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/68154\/revisions"}],"predecessor-version":[{"id":92439,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/68154\/revisions\/92439"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=68154"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=68154"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=68154"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=68154"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}