{"id":79059,"date":"2018-06-06T12:42:35","date_gmt":"2018-06-06T12:42:35","guid":{"rendered":"https:\/\/www.red-gate.com\/simple-talk\/?p=79059"},"modified":"2021-09-15T13:22:15","modified_gmt":"2021-09-15T13:22:15","slug":"power-bi-introduction-working-with-power-bi-desktop-part-2","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/bi-sql-server\/power-bi-introduction-working-with-power-bi-desktop-part-2\/","title":{"rendered":"Power BI Introduction: Working with Power BI Desktop \u2014 Part 2"},"content":{"rendered":"<p><strong>The series so far:<\/strong><\/p>\n<ol>\n<li><a href=\"https:\/\/www.red-gate.com\/simple-talk\/sql\/bi\/power-bi-part-1-introduction\/\">Power BI Introduction: Tour of Power BI \u2014 Part 1<\/a><\/li>\n<li><a href=\"https:\/\/www.red-gate.com\/simple-talk\/sql\/bi\/power-bi-introduction-working-with-power-bi-desktop-part-2\/\">Power BI Introduction: Working with Power BI Desktop \u2014 Part 2<\/a><\/li>\n<li><a href=\"https:\/\/www.red-gate.com\/simple-talk\/sql\/bi\/power-bi-introduction-working-with-r-scripts-in-power-bi-desktop-part-3\/\">Power BI Introduction: Working with R Scripts in Power BI Desktop \u2014 Part 3<\/a><\/li>\n<li><a href=\"https:\/\/www.red-gate.com\/simple-talk\/sql\/bi\/power-bi-introduction-working-with-parameters-in-power-bi-desktop-part-4\/\">Power BI Introduction: Working with Parameters in Power BI Desktop \u2014 Part 4<\/a><\/li>\n<li><a href=\"https:\/\/www.red-gate.com\/simple-talk\/sql\/bi\/power-bi-introduction-working-with-sql-server-data-in-power-bi-desktop-part-5\/\">Power BI Introduction:\u00a0Working with SQL Server data in Power BI Desktop \u2014 Part 5<\/a><\/li>\n<li><a href=\"https:\/\/www.red-gate.com\/simple-talk\/homepage\/power-bi-introduction-power-query-m-formula-language-in-power-bi-desktop-part-6\/\">Power BI Introduction:\u00a0Power Query M Formula Language in Power BI Desktop \u2014 Part 6<\/a><\/li>\n<li><a href=\"https:\/\/www.red-gate.com\/simple-talk\/sql\/bi\/power-bi-introduction-building-reports-in-power-bi-desktop-part-7\/\">Power BI Introduction: Building Reports in Power BI Desktop \u2014 Part 7<\/a><\/li>\n<li><a href=\"https:\/\/www.red-gate.com\/simple-talk\/sql\/bi\/power-bi-introduction-publishing-reports-to-the-power-bi-service-part-8\/\">Power BI Introduction: Publishing Reports to the Power BI Service \u2014 Part 8<\/a><\/li>\n<li><a href=\"https:\/\/www.red-gate.com\/simple-talk\/sql\/bi\/power-bi-introduction-visualizing-sql-server-audit-data-part-9\/\">Power BI Introduction: Visualizing SQL Server Audit Data \u2014 Part 9<\/a><\/li>\n<\/ol>\n\n<p>Microsoft\u2019s Power BI is a comprehensive suite of integrated business intelligence (BI) tools for retrieving, consolidating, and presenting data through a variety of rich visualizations. One of the suite\u2019s most important tools is Power BI Desktop, a free, on-premises application for building comprehensive reports that can be published to the Power BI service or saved to Power BI Report Server, where they can be shared with users through their browsers, mobile apps, or custom applications.<\/p>\n<p>This article is the second in a series about Power BI. The first article covered the basic components that make up the Power BI suite. This article focuses specifically on Power BI Desktop because of the important role it plays in creating reports that can then be used by the other components.<\/p>\n<p>With Power BI Desktop, you can access a wide range of data sources, consolidate data from multiple sources, transform and enhance the data, and build reports that utilize the data. Although you can access data sources and create reports directly in the Power BI service, Power BI Desktop provides a far more robust environment for working with data and creating visualizations. You can then easily publish your reports to the Power BI service by providing the necessary credentials.<\/p>\n<p>The features in Power BI Desktop are organized into three views that you access from the navigation pane at the left side of the main window:<\/p>\n<ul>\n<li>\n<p><strong>Report view:<\/strong> A canvas for building and viewing reports based on the datasets defined in <em>Data<\/em> view.<\/p>\n<\/li>\n<li>\n<p><strong>Data view:<\/strong> Defined datasets based on data retrieved from one or more data sources. <em>Data<\/em> view offers limited transformation features, with many more capabilities available through Query Editor, which opens in a separate window.<\/p>\n<\/li>\n<li>\n<p><strong>Relationships view:<\/strong> Identified relationships between the datasets defined in <em>Data<\/em> view. When possible, Power BI Desktop identifies the relationships automatically, but you can also define them manually.<\/p>\n<\/li>\n<\/ul>\n<p>To access any of the three views, click the applicable button in the left navigation pane. The following figure shows Power BI Desktop with <em>Report<\/em> view selected, displaying a one-page report with three visualizations.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1344\" height=\"726\" class=\"wp-image-79092\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/06\/word-image-21.png\" \/><\/p>\n<p>When working in Power BI Desktop, you will normally perform the following four basic steps, although not necessarily in a single session:<\/p>\n<ol>\n<li>\n<p>Connect to one or more data sources and retrieve the necessary data.<\/p>\n<\/li>\n<li>\n<p>Transform and enhance the data using <em>Data<\/em> view, <em>Relationships<\/em> view, or the Query Editor, as necessary.<\/p>\n<\/li>\n<li>\n<p>Create reports based on the transformed data, using <em>Report<\/em> view.<\/p>\n<\/li>\n<li>\n<p>Publish the reports to the Power BI service or upload them to Power BI Report Server.<\/p>\n<\/li>\n<\/ol>\n<p>This article will focus primarily on steps 1 and 2 so you can get a better sense of how to retrieve and prepare the data, which is essential to building comprehensive reports. Later in the series, you\u2019ll learn more about building reports with different types of visualizations, but first you need to make sure you get the data right.<\/p>\n<h2>Connecting to Data in Power BI Desktop<\/h2>\n<p>Power BI Desktop supports connectivity to a wide range of data sources, which are divided into the following categories:<\/p>\n<ul>\n<li>\n<p><strong>All:<\/strong> Every data source type available through Power BI.<\/p>\n<\/li>\n<li>\n<p><strong>File:<\/strong> Source files such as Excel, CSV, XML, or JSON.<\/p>\n<\/li>\n<li>\n<p><strong>Database:<\/strong> Database systems such as SQL Server, Oracle Database, IBM DB2, MySQL, SAP HANA, and Amazon Redshift.<\/p>\n<\/li>\n<li>\n<p><strong>Azure:<\/strong> Azure services such as SQL Database, SQL Data Warehouse, Blob Storage, and Data Lake Store.<\/p>\n<\/li>\n<li>\n<p><strong>Online Services:<\/strong> Non-Azure services such as Google Analytics, Salesforce Reports, Facebook, Microsoft Exchange Online, and the Power BI service.<\/p>\n<\/li>\n<li>\n<p><strong>Other:<\/strong> Miscellaneous data source types such as Microsoft Exchange, Active Directory, Hadoop File System, ODBC, OLE DB, and OData Feed.<\/p>\n<\/li>\n<\/ul>\n<p>To retrieve data from one of these data sources, click the <em>Get Data<\/em> button on the <em>Home<\/em> ribbon in the main Power BI window. This launches the <em>Get Data<\/em> dialog box, shown in the following figure.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"622\" height=\"676\" class=\"wp-image-79093\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/06\/word-image-22.png\" \/><\/p>\n<p>To access a data source, navigate to the applicable category, select the source type, and click <em>Connect<\/em>. You will then be prompted to provide additional information, depending on the data source. This might include connectivity details, an instance or file name, or other types of information. After you provide the necessary details, Power BI Desktop will launch a preview window that will display a sample of the data along with additional options.<\/p>\n<p>This article uses a CSV file based on data from the Hawks dataset, which is available through a <a href=\"https:\/\/vincentarelbundock.github.io\/Rdatasets\/datasets.html\">GitHub collection of sample datasets<\/a>. The Hawks dataset contains data collected over several years from a hawk blind at Lake MacBride near Iowa City, Iowa. Name the file <em>hawks.csv<\/em> and save it to a local folder (<em>C:\\DataFiles<\/em>).<\/p>\n<p>To import the data from the file into Power BI Desktop, select the <em>Text\/CSV<\/em> data source type in the <em>Get Data<\/em> dialog box. Click <em>Connect<\/em> and the <em>Open<\/em> dialog box will appear. From there, navigate to the <em>C:\\DataFiles<\/em> folder, select the <em>hawks.csv<\/em> file, and click <em>Open.<\/em> This launches the preview window shown in the following figure.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"893\" height=\"671\" class=\"wp-image-79094\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/06\/word-image-23.png\" \/><\/p>\n<p>In addition to being able to preview a subset of the data, you can configure several options: <em>File Origin<\/em> (the document\u2019s encoding), <em>Delimiter,<\/em> and <em>Data Type Detection<\/em>. In most cases, you\u2019ll be concerned primarily with the <em>Data Type Detection<\/em> option. By default, this is set to <em>Based on first 200 rows,<\/em> which means that Power BI Desktop will convert the data to Power BI data types based only on the first 200 rows of data.<\/p>\n<p>This might be okay in some cases, but there could be times when a column includes a value different from the majority of values, but that value is not in the first 200 rows. As a result, you could end up with a suboptimal data type or even an error. To avoid this risk, you can use the <em>Data Type Detection<\/em> option to instruct Power BI Desktop to base the data type selection on the entire dataset, or you can instead forego any conversion and keep all the data as string values.<\/p>\n<p>You can also edit the dataset before loading it into Power BI Desktop. When you click the <em>Edit<\/em> button, Power BI Desktop launches Query Editor, which provides a number of tools for transforming data. After you make the changes, you can then save the updated dataset.<\/p>\n<p>In this case, load the data into Power BI Desktop without changing any settings or editing the dataset. You will be transforming the data later as part of working through this article. To import the data as is, you need only click the <em>Load<\/em> button. You can then view the imported dataset in <em>Data<\/em> view, as shown in the following figure.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1344\" height=\"609\" class=\"wp-image-79095\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/06\/word-image-24.png\" \/><\/p>\n<p>After you import a dataset into Power BI Desktop, you can use the data in your reports as is, or you can edit the data, applying various transformations and in other ways shaping the data. Although you can make some modifications in <em>Data<\/em> view, such as being able to add or remove columns or sort data, for the majority of transformations, you will need to use Query Editor, which includes a wide range of features for shaping data.<\/p>\n<h2>Introducing Query Editor<\/h2>\n<p>To launch Query Editor, click the <em>Edit Queries<\/em> button on the <em>Home<\/em> ribbon. Query Editor opens as a separate window from the main Power BI Desktop window and displays the contents of the dataset selected in the left pane, as shown in the following figure. In this case, the pane includes only the <em>hawks<\/em> dataset, which is the one you just imported.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1332\" height=\"703\" class=\"wp-image-79096\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/06\/word-image-25.png\" \/><\/p>\n<p>The dataset itself is displayed in the main pane of the Query Editor window. You can scroll up and down or right and left to see all the data. The pane to the right, which is separated into two sections, provides additional information about the dataset. The <em>Properties<\/em> section displays the dataset\u2019s properties. In this case, the section includes only the <em>Name<\/em> property. To view all properties associated with a dataset, click the <em>All Properties<\/em> link within that section.<\/p>\n<p>The <em>Applied Steps <\/em>section lists the transformations that have been applied to the dataset in the order they were applied. Each step in this section is associated with a script statement written in the Power Query M formula language. The statement is shown in the formula bar directly above the dataset. (If the formula bar is not displayed, select the <em>Formula Bar<\/em> checkbox on the <em>View<\/em> ribbon to display it.)<\/p>\n<p>Currently, the <em>Applied Steps<\/em> section for the <em>hawks<\/em> dataset includes only two steps: <em>Source<\/em> and <em>Changed Type<\/em>. Both steps are generated by default when importing a CSV file. The <em>Source<\/em> step corresponds to the following M statement (provided here in case you cannot read it in the above figure):<\/p>\n<pre class=\"lang:none theme:none\">= Csv.Document(File.Contents(\"C:\\DataFiles\\hawks.csv\"),[Delimiter=\",\", Columns=20, Encoding=1252, QuoteStyle=QuoteStyle.None])<\/pre>\n<p>The M statement consists primarily of the connection string necessary to import the <em>hawks.csv<\/em> file. Notice that it includes the delimiter and encoding specified in the preview window, along with other details.<\/p>\n<p>The second step, <em>Changed Type,<\/em> converts the data in each column to Power BI data types. The step is associated with the following M statement, which shows that the first column is assigned the <strong>Int64<\/strong> type and that all other columns are assigned the <strong>text<\/strong> type:<\/p>\n<pre class=\"lang:none theme:none\">= Table.TransformColumnTypes(Source,{{\"Column1\", Int64.Type}, {\"Column2\", type text}, {\"Column3\", type text}, {\"Column4\", type text}, {\"Column5\", type text}, {\"Column6\", type text}, {\"Column7\", type text}, {\"Column8\", type text}, {\"Column9\", type text}, {\"Column10\", type text}, {\"Column11\", type text}, {\"Column12\", type text}, {\"Column13\", type text}, {\"Column14\", type text}, {\"Column15\", type text}, {\"Column16\", type text}, {\"Column17\", type text}, {\"Column18\", type text}, {\"Column19\", type text}, {\"Column20\", type text}})<\/pre>\n<p>If you had opted to forego data type conversions when importing the <em>hawks.csv<\/em> file, all columns would be assigned the <strong>text<\/strong> data type, but you instead stuck with the <em>Based on first 200 rows<\/em> option, which resulted in one column being converted to the <strong>Int64<\/strong> type. The following figure shows the dataset with the <em>Changed Type<\/em> step selected and its associated M statement in the formula bar.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1331\" height=\"705\" class=\"wp-image-79097\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/06\/word-image-26.png\" \/><\/p>\n<p>This article won\u2019t spend much time on the M language. Just know that the language is the driving force behind the Power BI Desktop transformations in Query Editor. Articles later in this series will dig into the language in more detail so you can work with it directly.<\/p>\n<p>Every transformation you apply to a dataset is listed in the <em>Applied Steps<\/em> section, and each step is associated with an M statement that builds on the preceding step. You can use the steps and their associated M statements to work with the dataset in various ways. For example, if you select a step in the <em>Applied Steps<\/em> section, Query Editor will display the dataset as it existed when the step was applied. You can also modify steps, delete steps, inject steps in between others, or move steps up or down. In fact, the <em>Applied Steps<\/em> section is one of the most powerful tools you have for working with Power BI Desktop datasets. Be aware, however, that it\u2019s very easy to introduce an error into the steps that turns all your work into a big mess.<\/p>\n<p>With that in mind, you\u2019ll next provide column names for the dataset. You might have noticed from the screenshots\u2014or when you imported the data for yourself\u2014that the first row of data contains the original column names. This is why all columns but the first were assigned the <strong>text<\/strong> data type, even if most values were numbers. The first column was assigned the <strong>Int64<\/strong> data type because it was the index for the original dataset and did not require a column name.<\/p>\n<p>To convert the first row into columns names, click the <em>Use First Row as Headers<\/em> button on the <em>Home<\/em> ribbon. Power BI Desktop will promote the row and add a <em>Promoted Headers<\/em> step to the <em>Applied Steps<\/em> section, as well as a <em>Changed Type<\/em> step, which converts the data types of any fields that might be impacted by removing the column names as data values.<\/p>\n<p>The following M statement is associated with the <em>Changed Type<\/em> step added to the <em>hawks<\/em> dataset:<\/p>\n<pre class=\"lang:none theme:none\">= Table.TransformColumnTypes(#\"Promoted Headers\",{{\"Column1\", Int64.Type}, {\"Month\", Int64.Type}, {\"Day\", Int64.Type}, {\"Year\", Int64.Type}, {\"CaptureTime\", type text}, {\"ReleaseTime\", type text}, {\"BandNumber\", type text}, {\"Species\", type text}, {\"Age\", type text}, {\"Sex\", type text}, {\"Wing\", type number}, {\"Weight\", type text}, {\"Culmen\", type text}, {\"Hallux\", type text}, {\"Tail\", Int64.Type}, {\"StandardTail\", type text}, {\"Tarsus\", type text}, {\"WingPitFat\", type text}, {\"KeelFat\", type text}, {\"Crop\", type text}})<\/pre>\n<p>Notice that, in addition to the first column, several other columns have been converted to the <strong>Int64<\/strong> data type. However, the <em>Wing<\/em> column has been converted to the <strong>number<\/strong> type because it contains at least one decimal value. Other columns retained the <strong>text<\/strong> data type because they contained only character values or contained a mix of types.<\/p>\n<p>It should be noted that the <em>Changed Type<\/em> step just added to the <em>hawks<\/em> data set is actually named <em>Changed Type1<\/em>. If a step is added that is of the same type as a previous step, Query Editor tags a numerical value to onto the new step to differentiate its name from the earlier step.<\/p>\n<p>To complete the process of promoting the first row to headers, you might need to rename one or more columns. For example, because the <em>hawks<\/em> dataset did not include a name for the first column, change it from the auto-generated <em>column1<\/em> name to <em>TagID,<\/em> as shown in the following figure by right-clicking into the <em>column1<\/em> heading and selecting <em>Rename<\/em>.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1175\" height=\"558\" class=\"wp-image-79098\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/06\/word-image-27.png\" \/><\/p>\n<p>As you can see, the <em>Applied Steps<\/em> section now includes the <em>Renamed Columns<\/em> step. Also notice that each column name is preceded by a symbol that indicates the column\u2019s data type. If you click the symbol, a context menu will appear and display a list of symbols and their meanings. You can also change the data type from this menu.<\/p>\n<p>Note, however, that the name of the data types listed here are not always the same as the names used in the M statements. For example, the <strong>Int64<\/strong> data type that appears in the M statements is listed as the <em>Whole Number<\/em> data type in the context menu.<\/p>\n<h2>Addressing Errors<\/h2>\n<p>When working in Query Editor, you might run into errors when converting data, modifying applied steps, or taking other actions. Often you won\u2019t realize that there is an error until you try to apply any changes you\u2019ve made. (You should be applying and saving your changes regularly when transforming data.)<\/p>\n<p>To apply your changes in Query Editor, click the <em>Close &amp; Apply<\/em> down arrow on the <em>Home<\/em> ribbon, and then click <em>Apply<\/em>. Any changes that you made since they last time you applied changes will be incorporated into the dataset, unless there is an error. For example, when you apply the changes to the <em>hawks<\/em> dataset after promoting the headers, you will receive the message shown in the following figure.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"517\" height=\"259\" class=\"wp-image-79099\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/06\/word-image-28.png\" \/><\/p>\n<p>When you click the <em>View errors<\/em> link, Query Editor will isolate the row that contains the error, as shown in the following figure. The <em>TagID<\/em> value for this row is <em>263<\/em>. After viewing the error row, you should see that the <em>Wing<\/em> column contains an <em>NA<\/em> value, although the column is configured with the <strong>number<\/strong> data type, which is why you received the error.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1062\" height=\"233\" class=\"wp-image-79100\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/06\/word-image-29.png\" \/><\/p>\n<p>Click hawks on the left to exit out of the <em>Errors<\/em> screen and get back to the Query Editor.<\/p>\n<p>To address an error in Query Editor, you can replace or remove the error or change the column\u2019s data type:<\/p>\n<ul>\n<li>\n<p>To change the column\u2019s type, click the type icon next to the column name, and then click <em>Text<\/em>.<\/p>\n<\/li>\n<li>\n<p>To replace the error, right-click the column header, click <em>Replace Errors,<\/em> type a replacement value, and then click <em>OK<\/em>.<\/p>\n<\/li>\n<li>\n<p>To filter out the row with the error, click the down arrow next to the <em>TagID<\/em> header, search for the <em>263,<\/em> and then clear the checkbox associated with this ID.<\/p>\n<\/li>\n<li>\n<p>To remove all rows that contain errors, click the table icon in the top left corner of the data set, and then click <em>Remove Errors<\/em>. This is the approach I took for the <em>hawks<\/em> dataset. As a result, the <em>Removed Errors<\/em> step was added to the <em>Applied Steps<\/em> section.<\/p>\n<\/li>\n<\/ul>\n<div class=\"note\">\n<p><em>NOTE: One thing interesting about the error is that it occurs as a result of promoting the first row to column headers and the subsequent type conversion that came with it. From what I can tell, Query Editor must use only a sample of the data when selecting the data type during this step. It does not appear to have anything to do with the <em>Data Type Detection<\/em> setting in the preview window. I tried all possible settings and the results were always the same.<\/em><\/p>\n<\/div>\n<h2>Removing Columns<\/h2>\n<p>In some cases, the data you import includes columns that you don\u2019t need for your reports. Query Editor makes removing these columns a simple and quick process:<\/p>\n<ul>\n<li>\n<p>To remove a single column, select the column directly in the displayed dataset, and then click the <em>Remove Columns<\/em> button on the <em>Home<\/em> ribbon, or you can instead right-click the column header and then click <em>Remove<\/em>.<\/p>\n<\/li>\n<li>\n<p>To remove multiple columns, select the first column, press and hold the <em>Control<\/em> key, select each additional column, and then click the <em>Remove Columns<\/em> button on the <em>Home<\/em> ribbon.<\/p>\n<\/li>\n<\/ul>\n<p>For the <em>hawks<\/em> dataset, remove the following columns because they are not relevant to the reports you might create or they contained numerous <em>NA<\/em> values:<\/p>\n<ul>\n<li>\n<p>CaptureTime<\/p>\n<\/li>\n<li>\n<p>ReleaseTime<\/p>\n<\/li>\n<li>\n<p>Sex<\/p>\n<\/li>\n<li>\n<p>Culmen<\/p>\n<\/li>\n<li>\n<p>Hallux<\/p>\n<\/li>\n<li>\n<p>Tail<\/p>\n<\/li>\n<li>\n<p>StandardTail<\/p>\n<\/li>\n<li>\n<p>Tarsus<\/p>\n<\/li>\n<li>\n<p>WingPitFat<\/p>\n<\/li>\n<li>\n<p>KeelFat<\/p>\n<\/li>\n<li>\n<p>Crop<\/p>\n<\/li>\n<\/ul>\n<p>After removing the columns, Query Editor adds a single <a id=\"post-79059-OLE_LINK1\"><\/a><a id=\"post-79059-OLE_LINK2\"><\/a><em>Removed Columns<\/em> step to the <em>Applied Steps<\/em> section, as shown in the following figure.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1340\" height=\"508\" class=\"wp-image-79101\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/06\/word-image-30.png\" \/><\/p>\n<p>One of the advantages of having the <em>Removed Columns<\/em> step listed along with the other steps is that you can easily remove or modify the step if you later decide that one or more of the columns should not have been removed. In this case, you\u2019re relatively safe modifying this step because any steps that you might have added after deleting the columns cannot reference those columns because they\u2019re considered nonexistent. Close the Query Editor to go back to <em>Data<\/em> view.<\/p>\n<h2>Adding a Calculated Column<\/h2>\n<p>With Power BI Desktop, you can add calculated columns to a dataset that concatenates data or performs calculations. Power BI Desktop provides two methods for adding a calculated column. The first is to create the column in <em>Data<\/em> view, using the Data Analysis Expressions (DAX) language to define the column\u2019s logic.<\/p>\n<p>To create a DAX-based column, click the <em>New Column<\/em> button on the <em>Home<\/em> ribbon in <em>Data<\/em> view, and then type the DAX expression in the formula bar at the top of the dataset. The new column should be selected when adding the expression. After you type the expression, click the checkmark to the left of the expression to verify the syntax and populate the new column.<\/p>\n<p>For example, add a column named <em>Date<\/em> to the <em>hawks<\/em> dataset to provide a single value that shows the date when a hawk was tagged, using the following DAX expression:<\/p>\n<pre class=\"lang:none theme:none\">Date = DATE([Year], [Month], [Day])<\/pre>\n<p>The expression uses the <strong>DATE<\/strong> function to create a date value based on the <em>Year, Month,<\/em> and <em>Day<\/em> columns. After you add the <em>Date<\/em> column, you can set the displayed format. Go to the <em>Modeling<\/em> ribbon in <em>Data<\/em> view, click the <em>Format<\/em> drop-down list, point to <em>Date Time,<\/em> and click <em>3\/14\/2001 M\/d\/yyyy<\/em>. The following figure shows the new column after I added it to the dataset, with the applied format.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1062\" height=\"539\" class=\"wp-image-79102\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/06\/word-image-31.png\" \/><\/p>\n<p>Using DAX to create a column in <em>Data<\/em> view is quick and easy. However, this approach has some limitations. For example, the column is not available in Query Editor, so it cannot be used as part of another calculated column definition in Query Editor. In addition, if you delete a column referenced by a DAX expression, the values in the DAX-based column will show only errors.<\/p>\n<p>The safest way to get around these limitations is to create your column in Query Editor. Delete the <em>Date<\/em> column you just added in the <em>Data<\/em> view. Click <em>Edit Queries<\/em> to reopen the Query Editor. Click the <em>Custom Column<\/em> button on the <em>Add Column<\/em> tab. When the <em>Custom Column<\/em> dialog box appears, type a name for the new column and the column\u2019s formula, which is an M expression that defines the column\u2019s logic. (Query Editor automatically adds the rest of the M syntax to create a complete statement.) For example, to create a column like the DAX-based column above, used the following M expression:<\/p>\n<pre class=\"lang:none theme:none\">= #date([Year], [Month], [Day])<\/pre>\n<p>The statement uses the <strong>#date<\/strong> method to create a date value based on the<em> Year, Month,<\/em> and <em>Day<\/em> columns, similar to the DAX expression. The following figure shows what the M expression looks like when entered into the <em>Custom Column<\/em> dialog box.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"714\" height=\"428\" class=\"wp-image-79103\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/06\/word-image-32.png\" \/><\/p>\n<p>When entering your M expression in the <em>Custom Column<\/em> dialog box, be sure that the green arrow is showing at the bottom of the dialog box to ensure that there are no syntax errors. Be aware, however, that you can have errors in your M statement without them showing up as syntax errors.<\/p>\n<p>When you create a calculated column, Query Editor adds the <em>Added Custom<\/em> step to the <em>Applied Steps<\/em> section. You can then reference the new column in other calculated columns. You can also remove any of the columns referenced by the M expression. In this case, remove the <em>Month<\/em> and <em>Day<\/em> columns but keep the <em>Year<\/em> column in case you want it available for reports.<\/p>\n<h2>Splitting a Column<\/h2>\n<p>In Query Editor, you can split a column based on a specified value (delimiter) or by a specified number of characters. For example, the <em>hawks<\/em> dataset includes the <em>BandNumber<\/em> column, which is made up of two parts, separated by a hyphen. One of those parts might have special meaning, such as indicating the individual who tagged the hawk or the process used to tag the hawk. Splitting the column could make it easier to group the data by a particular entity.<\/p>\n<p>To split a column by a delimiter, right-click the column\u2019s header, point to <em>Split Column<\/em>, and then click <em>By<\/em> <em>Delimiter<\/em>. In the <em>Split<\/em> <em>Column<\/em> <em>by<\/em> <em>Delimiter<\/em> dialog box, specify the delimiter and then click <em>OK<\/em>. Query Editor will split the column into two columns (removing the delimiter) and update the data types if necessary. You can then rename the columns if you do not want to use the autogenerated names.<\/p>\n<p>The following figure shows the <em>hawks<\/em> dataset after splitting the <em>BandNumber<\/em> column and changing the names of the two new columns to <em>BandPrefix<\/em> and <em>BandSuffix<\/em>.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"950\" height=\"415\" class=\"wp-image-79104\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/06\/word-image-33.png\" \/><\/p>\n<p>Notice that three steps have been added to the <em>Applied Steps<\/em> section: <em>Split Column by Delimiter, Changed Type2,<\/em> and <em>Renamed Columns1<\/em>.<\/p>\n<h2>Performing Additional Data Transformations<\/h2>\n<p>When you\u2019re preparing a dataset for reporting, you should evaluate the data types that have been assigned to each column to ensure they\u2019re what you need. Usually you\u2019ll want to hold off doing this until after you\u2019ve added, removed, or split columns so you\u2019re not introducing unnecessary transformation steps. Evaluating the data types can also point to possible anomalies in the data.<\/p>\n<p>For example, the <em>BandPrefix<\/em> column was assigned the <strong>number<\/strong> data type, which would have been expected to be the <strong>Int64<\/strong> data type. The same goes for the <em>Wing<\/em> column. In addition, the <em>Weight<\/em> column was assigned the <strong>text<\/strong> data type, and the <em>Date<\/em> column was assigned the <strong>Any<\/strong> data type. For this article, change the data types for the <em>BandPrefix, Wing<\/em>, and <em>Weight<\/em> columns to the <strong>Int64<\/strong> data type (whole number), and change the <em>Date<\/em> column to the <strong>date<\/strong> data type.<\/p>\n<div class=\"note\">\n<p><em>NOTE: I could not find a clear answer for why Query Editor assigned the <strong>number<\/strong> type to the <em>BandPrefix<\/em> column and the <strong>Int64<\/strong> type to the <em>BandSuffix<\/em> column. The only numeric values I found in the <em>BandPrefix<\/em> column were whole numbers. However, when researching this issue, I discovered that both columns contained null values, so I removed those rows. To filter out rows with null values, click the down arrow in the applicable column header, and clear the checkbox associated with the null value.<\/em><\/p>\n<\/div>\n<p>The reason the <em>Wing<\/em> column was assigned the <strong>number<\/strong> data type was because it contained a decimal value. When changing the data type to <strong>Int64<\/strong>, Query Editor automatically rounded the value to the nearest whole number, which worked fine for the purposes of this article.<\/p>\n<p>The process was not quite as simple for the <em>Weight<\/em> column. The reason it was assigned the <strong>text<\/strong> data type was because the column contained <em>NA<\/em> values, which were treated as errors when changing the column to the <strong>Int64<\/strong> data type. Remove these rows by using the <em>Remove Errors<\/em> option.<\/p>\n<p>The next step is to move the <em>Date<\/em> column left within the dataset so it appears before the <em>Year<\/em> column. To move a column in Query Editor, simply drag the column header to the new position. Query Editor will add the <em>Reordered Columns<\/em> step to the <em>Applied Steps<\/em> section, as shown in the following figure. The figure also shows the other steps that were added when changing data types, filtered rows, and removed errors.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1327\" height=\"581\" class=\"wp-image-79105\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/06\/word-image-34.png\" \/><\/p>\n<p>The final step is to replace the values in the <em>Species<\/em> and <em>Age<\/em> columns to make them more readable. To replace a value, right-click the column header, and click <em>Replace Values<\/em>. In the <em>Replace Values<\/em> dialog box, type the original value and new value in the applicable text boxes. Next, click the <em>Advanced options<\/em> arrow, and select the <em>Match entire cell contents<\/em> checkbox. This is important to ensure that you don\u2019t replace part of another value.<\/p>\n<p>The first value to replace in the <em>hawks<\/em> data set is the <em>RT<\/em> value in the <em>Species<\/em> column. For the new value, used <em>Red-Tailed,<\/em> as shown in the following figure.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"714\" height=\"379\" class=\"wp-image-79106\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/06\/word-image-35.png\" \/><\/p>\n<p>Also replace the following values:<\/p>\n<ul>\n<li>\n<p>The <em>CH<\/em> value in the <em>Species<\/em> column with <em>Cooper\u2019s<\/em>.<\/p>\n<\/li>\n<li>\n<p>The <em>SS<\/em> value in the <em>Species<\/em> column with <em>Sharp-Shinned<\/em>.<\/p>\n<\/li>\n<li>\n<p>The <em>A<\/em> value in the <em>Age<\/em> column with <em>Adult<\/em>.<\/p>\n<\/li>\n<li>\n<p>The <em>I<\/em> value in the <em>Age<\/em> column with <em>Immature<\/em>.<\/p>\n<\/li>\n<\/ul>\n<p>The following figure shows the updated values in the <em>Species<\/em> and <em>Age<\/em> columns, as the dataset appears in the <em>Data<\/em> view of the main Power BI Desktop window.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"961\" height=\"578\" class=\"wp-image-79107\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/06\/word-image-36.png\" \/><\/p>\n<p>Notice that the columns are not impacted by the reordering that was done in Query Editor. The newer columns\u2014<em>Date, BandPrefix,<\/em> and <em>BandSuffix<\/em>\u2014are included at the right side of the dataset, after the original columns. Also notice that in this case the <em>Date<\/em> column is configured in the format <em>mm\/dd\/yyyy,<\/em> but you might see the data values spelled out or displayed in another format. However, you can choose any format from the <em>Format<\/em> drop-down list on the <em>Modeling<\/em> ribbon.<\/p>\n<h2>Generating Reports<\/h2>\n<p>Once you get your datasets in the format you want them, you can begin to create reports and their visualizations. Just to show you a sample of what can be done, I created one report with three visualizations based on the <em>hawks<\/em> data set, as shown in the following figure.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1346\" height=\"609\" class=\"wp-image-79108\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/06\/word-image-37.png\" \/><\/p>\n<p>The figure contains a one-page report as it appears in <em>Report<\/em> view in Power BI Desktop. The report includes a table, bar chart, and donut chart, all based on the <em>hawks<\/em> dataset. Although this article doesn\u2019t focus on reports, be aware that <em>Report<\/em> view includes a variety of options for creating and configuring different types of visualizations. Later in the series, you\u2019ll learn how to create reports and visualizations and publish them to the Power BI service.<\/p>\n<h2>Working with Power BI Desktop<\/h2>\n<p>There\u2019s much more you can do with Power BI Desktop and Query Editor than what was covered here. In addition to creating various types of visualizations, you can merge and append queries, group and aggregate data, pivot columns, define relationships, run R scripts, and carry out a number of other tasks. The best way to become familiar with these features is to experiment with them against different types of data. The better you understand how to work with Power BI Desktop, the more effectively you can use the rest of the Power BI tools, including the Power BI service, Power BI Report Server and the Power BI mobile apps.<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Power BI Desktop is the free tool from Microsoft for developing in Power BI. In this article, Robert Sheldon walks you through importing and manipulating data with the Query Editor in Power BI Desktop.&hellip;<\/p>\n","protected":false},"author":221841,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[143528],"tags":[],"coauthors":[6779],"class_list":["post-79059","post","type-post","status-publish","format-standard","hentry","category-bi-sql-server"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/79059","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/221841"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=79059"}],"version-history":[{"count":14,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/79059\/revisions"}],"predecessor-version":[{"id":79301,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/79059\/revisions\/79301"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=79059"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=79059"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=79059"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=79059"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}