The Data Analysis Expressions (DAX) language includes a wide range of functions that help you refine your queries when retrieving data from a SQL Server Analysis Services (SSAS) tabular database. You’ve seen numerous examples of DAX functions throughout this series on the tabular model. One set of functions that are particularly useful when working with tabular data are the statistical functions, which support various ways to aggregate and analyze data. Statistical functions include not only the typical standbys, such as Min and Max, but also those that help structure that data into tables, such as AddColumns and Summarize.

In this article, the fifth in our series, we explore many of the DAX statistical functions and provide numerous examples that demonstrate how they work. As with the previous articles (one | two | three | four), the examples retrieve data from the AdventureWorks Tabular Model SQL 2012 tabular database, available as a SQL Server Data Tools (SSDT) tabular project from the AdventureWorks CodePlex site. I implemented the database on a local instance of SSAS 2012 in tabular mode and developed the examples against that database.

To follow along in this article, you should have a basic understanding of how to write a DAX query. The first two articles provide a good starting point if you’re new to DAX and SSAS tabular databases. Once you’ve read through those, you should have no problem applying the statistical functions we cover here.

Given the analytical nature of tabular databases, it’s not surprising that the statistical functions play such a key role; learning them is well worth the effort.

The Row Function

Often when retrieving and aggregating tabular data, you’ll want to return only a single row in order to get a quick overview of your data. That’s where the Row function comes in. It returns a single-row table made up of one or more columns. You can use the function in conjunction with other functions in order to view different types of information. For example, the following query uses the Sum and Format functions in conjunction with the Row function in order to return the total amount of sales from the Sales Amount column in the Internet Sales table:

evaluate

(

row

(

"Total Sales", format(sum('Internet Sales'[Sales Amount]), "Currency")

)

As you’ve seen in the previous articles, you must use the Evaluate statement to retrieve data from an SSAS tabular database. The statement takes one argument: a table or table expression. In this example, we use the Row function to create the table expression.

For each column you want the Row function to return, you must specify the column name in double quotes followed by an expression that defines the value to be returned, with a comma separating the two components. In the example above, the Row function returns only one column named Total Sales, and the column expression uses the Sum function to return the sales total from Sales Amount. The expression then uses the Format function to return the value in a standard currency format (in US dollars). The following table shows the results returned by the Evaluate statement:

Total Sales

$29,358,677.22

If you want to return more than one column, when using the Row function, you must include a name/expression pair for each one, separated by a comma, as shown in the following example:

evaluate

(

row

(

"Total Sales", format(sum('Internet Sales'[Sales Amount]), "Currency"),

"Average Sales", format(average('Internet Sales'[Sales Amount]), "Currency"),

"Highest Sale", format(max('Internet Sales'[Sales Amount]), "Currency"),

"Lowest Sale", format(min('Internet Sales'[Sales Amount]), "Currency")

)

Each column definition follows the same structure: column name followed by column expression. In this case, the column definitions differ only in the column name and the statistical function being applied. The Average function returns the average sales amount, the Max function return the highest sales amount, and the Min function returns the lowest. As the following results show, the Evaluate statement now returns four columns, but still one row:

Total Sales	Average Sales	Highest Sale	Lowest Sale
$29,358,677.22	$486.09	$3,578.27	$2.29

When using a statistical function such as Average, it can also be useful to know how many rows are being evaluated. The following example includes both the Count and DistinctCount statistical functions:

evaluate

(

row

(

"Total Sales", format(sum('Internet Sales'[Sales Amount]), "Currency"),

"Average Sales", format(average('Internet Sales'[Sales Amount]), "Currency"),

"Count", count('Internet Sales'[Sales Amount]),

"Distinct Count", distinctcount('Internet Sales'[Sales Amount])

)

The Count function returns the total number of values in the specified column, and the DistinctCount function returns the number of unique values in that column. The Count function does not work for string values, only number and date columns. Because we’re using the function on the Sales Amount column, it works fine because the column contains numerical data only. The following table shows the results now returned by the Evaluate statement:

Total Sales	Average Sales	Count	Distinct Count
$29,358,677.22	$486.09	60398	42

When working with the Count function, be aware that it counts only existing values. If a row is missing a value, that row is not counted. To get a count of all the rows in a table, you must use the CountRows statistical function, as shown in the following example:

evaluate

(

row

(

"Average Cost", format(average('Product'[Standard Cost]), "Currency"),

"Count", count('Product'[Standard Cost]),

"Distinct Count", distinctcount('Product'[Standard Cost]),

"Row Count", countrows('Product')

)

In this case, the StandardCost column in the Product table contains a number of empty cells, so a count based on that column returns a number less than the number of rows in the table, but the CountRows functions includes all rows, as the following results show:

Average Cost	Count	Distinct Count	Row Count
$434.27	395	135	606

As you’ve seen, the Row function can be useful whenever you need to return only a single row, which is often the case when aggregating data. However, the other statistical functions we’ve looked at are by no means limited to the Row function. For example, you can use them with the Summarize function when grouping data. Nevertheless, more often than not you’ll find that if you use the Row function, you’ll be using the aggregate-type statistical functions as well.

The Topn Function

The Topn function limits the rows returned by a query to the top number of rows specified when you call the function, similar to how a TOP expression works in T-SQL. When you use the Topn function, you must include three parameters. The first specifies the number of rows to return. The second is the table or table expression from which you’re extracting the top rows. And the third is an expression that specifies the order in which the rows in the table should be sorted before returning the top rows. Usually this is one or more columns in the target table.

For example, suppose you want to return the first five rows of the Sales Territory table, with the data sorted by the SalesTerritoryKey column. The following example shows how to call the Topn function and pass in the parameter values:

evaluate

(

topn(5, 'Sales Territory', [SalesTerritoryKey])

)

Once again, we start with an Evaluate statement and specify the Topn function as the statement’s table expression. Into the function, we pass in the three required parameter values, which specify that the top five rows should be returned from the Sales Territory table, sorted by the SalesTerritoryKey column. The Evaluate statement returns the following data:

SalesTerritoryKey	Sales Territory Region	Sales Territory Country	Sales Territory Group	Sales Territory Image
7	France	France	Europe	System.Byte[]
9	Australia	Australia	Pacific	System.Byte[]
8	Germany	Germany	Europe	System.Byte[]
10	United Kingdom	United Kingdom	Europe	System.Byte[]
11	NA	NA	NA	System.Byte[]

One thing that might seem odd about the results is that they return rows with the SalesTerritoryKey values of 7 through 11. If you’re familiar with the data, you know that the first value is 1. We’ll get into more details about this in a bit, but for now, know that the Topn function, by default, sorts the data in descending order, unlike most sort operations you’ll come across.

Another odd aspect of these results is that they’re not ordered according to the SalesTerritoryKey column, as you might expect. Although you have to specify the order when calling the Topn function, that order applies only when determining which rows to return, not when ordering the result set. For that, you need to add an OrderBy clause that specifies the SalesTerritoryKey column, as shown in the following example:

evaluate

(

topn(5, 'Sales Territory', [SalesTerritoryKey])

)

order by 'Sales Territory'[SalesTerritoryKey]

Now our statement will return the data in the specified order, as the following results show:

SalesTerritoryKey	Sales Territory Region	Sales Territory Country	Sales Territory Group	Sales Territory Image
7	France	France	Europe	System.Byte[]
8	Germany	Germany	Europe	System.Byte[]
9	Australia	Australia	Pacific	System.Byte[]
10	United Kingdom	United Kingdom	Europe	System.Byte[]
11	NA	NA	NA	System.Byte[]

The Topn function actually supports a fourth parameter, which determines whether the rows are sorted in ascending or descending order. As mentioned above, by default, they’re sorted in descending order. However, you can specify ascending or descending my adding a 0 for descending or a 1 for ascending. For example, the following Evaluate statement returns the top five rows in ascending order:

evaluate

(

topn(5, 'Sales Territory', [SalesTerritoryKey], 1)

)

prder by 'Sales Territory'[SalesTerritoryKey]

The only difference between this example and the previous one is the addition of the fourth parameter for the Topn function. Now the statement returns the first five rows in the table, as shown in the following results:

SalesTerritoryKey	Sales Territory Region	Sales Territory Country	Sales Territory Group	Sales Territory Image
1	Northwest	United States	North America	System.Byte[]
2	Northeast	United States	North America	System.Byte[]
3	Central	United States	North America	System.Byte[]
4	Southwest	United States	North America	System.Byte[]
5	Southeast	United States	North America	System.Byte[]

You can also use the Topn function in conjunction with other functions. For example, the following Evaluate statement returns the top five rows from the table returned by the Summarize function:

evaluate

(

topn(5,

summarize

(

'Internet Sales',

'Product'[Product Name],

'Date'[Calendar Year],

"Sales Amount", format(sum('Internet Sales'[Sales Amount]), "currency")

[Sales Amount], 0)

)

order by [Sales Amount] desc

In the previous examples, I specified the SalesTerritory table as the second parameter of the Topn function, but this time around I’m using the Summarize function as the table expression. The function returns the Product Name and Calendar Year columns, along with a calculated column that provides a sales total for each group of data. I then use the Topn function to return only the first five rows of the data returned by the Summarize function. Notice that I specify the calculated column (SalesAmount) as the sort column, in descending order. The following table shows the results returned by the Evaluate statement:

Product Name	Calendar Year	Sales Amount
Mountain-100 Silver, 38	2006	$98,599.71
Mountain-100 Silver, 38	2005	$98,599.71
Mountain-100 Black, 44	2005	$97,874.71
Mountain-100 Black, 38	2006	$94,499.72
Mountain-100 Black, 42	2006	$91,124.73

As you can see, the data now includes only the three columns specified in the Summarize function, with the data sorted in descending order, based on the SalesAmount column.

The Rollup Function

You’ve already gotten a taste of the Rollup function in the second article of this series. This time around, however, we’ll go into the function a little deeper and look at a couple other functions associated with Rollup.

The Rollup function is used in conjunction with the Summarize function. The Summarize function groups data together by the columns specified when you call the function. The function then returns a table that can be used by other functions, wherever a parameter requires a table value. You can also use the function as the table expression in an Evaluate statement.

The Summarize function supports the use of the Rollup function as part of the Summarize definition. The Rollup function adds roll-up rows to the returned table based on the columns used to group the data. The roll-up rows provide an additional layer of aggregation, above what you’ve already defined in your Summarize function. Let’s look at example to better understand how this works. In the following Evaluate statement, we use the Summarize function as the statement’s table expression:

evaluate

(

summarize

(

calculatetable

(

'Internet Sales',

'Product Category'[Product Category Name] = "Accessories"

'Product'[Product Name],

rollup('Date'[Calendar Year]),

"Total Sales", format(sum('Internet Sales'[Sales Amount]), "currency")

)

order by

'Product'[Product Name],

'Date'[Calendar Year]

The first parameter in the Summarize function requires its own table or table expression as its value. In this case, I’ve used a CalculateTable function to create that expression. The function returns the Internet Sales table, but limits the rows returned to those with a Product Category Name value equal to Accessories.

The second parameter value in the Summarize function specifies the Product Name column, which means the data will be grouped based on the values in that column.

The third value specifies the Calendar Year column. As a result, each product will be grouped by the year it was sold. However, I’ve also included the Rollup function in this expression. That means an additional row will be returned for each product that rolls up the data into a total for all years.

The last parameter value in the Summarize function is a calculated column named Total Sales. The column uses the Sum aggregate function to total the Sales Amount values for each grouping. I’ve also used the Format function to return the data in a currency format. The following table shows the results returned by the Evaluate statement:

Product Name	Calendar Year	Total Sales
All-Purpose Bike Stand		$39,591.00
All-Purpose Bike Stand	2007	$18,921.00
All-Purpose Bike Stand	2008	$20,670.00
Bike Wash – Dissolver		$7,218.60
Bike Wash – Dissolver	2007	$3,044.85
Bike Wash – Dissolver	2008	$4,173.75
Fender Set – Mountain		$46,619.58
Fender Set – Mountain	2007	$19,408.34
Fender Set – Mountain	2008	$27,211.24
Hitch Rack – 4-Bike		$39,360.00
Hitch Rack – 4-Bike	2007	$16,440.00
Hitch Rack – 4-Bike	2008	$22,920.00
HL Mountain Tire		$48,860.00
HL Mountain Tire	2007	$20,300.00
HL Mountain Tire	2008	$28,560.00
HL Road Tire		$27,970.80
HL Road Tire	2007	$12,877.00
HL Road Tire	2008	$15,093.80
Hydration Pack – 70 oz.		$40,307.67
Hydration Pack – 70 oz.	2007	$16,771.95
Hydration Pack – 70 oz.	2008	$23,535.72

As with the previous examples, the results shown here are only part of the returned data, but you can see that totals are returned for each product sold and year it was sold. You can also see the roll-up row for each product-the rows without a Calendar Year value.

There might be times when you want to mark in some way when a row is a roll-up, other than just depending on blank values in the roll-up column. DAX includes the IsSubtotal function that you can use in conjunction with the Rollup function to create a column that indicates when a row is a roll-up. The IsSubtotal function returns a value of True if the row is a roll-up. Otherwise, the function returns False. The following example demonstrates how to use the IsSubtotal function by creating an additional column named IsRollup:

evaluate

(

summarize

(

calculatetable

(

'Internet Sales',

'Product Category'[Product Category Name] = "Accessories"

'Product'[Product Name],

rollup('Date'[Calendar Year]),

"Total Sales", format(sum('Internet Sales'[Sales Amount]), "currency"),

"Is Rollup", IsSubtotal('Date'[Calendar Year])

)

order by

'Product'[Product Name],

'Date'[Calendar Year]

The Evaluate statement is the same as the one shown in the preceding example, except for the addition of the Is Rollup column. Notice that the expression that defines the column includes the IsSubtotal function. When calling the function, you need to include the name of the column specified in the Rollup function, in this case the Calendar Year column. The Evaluate statement returns the following results:

Product Name	Calendar Year	Total Sales	Is Rollup
All-Purpose Bike Stand		$39,591.00	True
All-Purpose Bike Stand	2007	$18,921.00	False
All-Purpose Bike Stand	2008	$20,670.00	False
Bike Wash – Dissolver		$7,218.60	True
Bike Wash – Dissolver	2007	$3,044.85	False
Bike Wash – Dissolver	2008	$4,173.75	False
Fender Set – Mountain		$46,619.58	True
Fender Set – Mountain	2007	$19,408.34	False
Fender Set – Mountain	2008	$27,211.24	False
Hitch Rack – 4-Bike		$39,360.00	True
Hitch Rack – 4-Bike	2007	$16,440.00	False
Hitch Rack – 4-Bike	2008	$22,920.00	False
HL Mountain Tire		$48,860.00	True
HL Mountain Tire	2007	$20,300.00	False
HL Mountain Tire	2008	$28,560.00	False
HL Road Tire		$27,970.80	True
HL Road Tire	2007	$12,877.00	False
HL Road Tire	2008	$15,093.80	False
Hydration Pack – 70 oz.		$40,307.67	True
Hydration Pack – 70 oz.	2007	$16,771.95	False
Hydration Pack – 70 oz.	2008	$23,535.72	False

The result set now includes the IsRollup column and is populated with True and False values, with each roll-up row receiving a value of True.

Now suppose that instead of grouping our data by product and then year sold, we group it by product category, then product subcategory, and finally the year sold, as shown in the following example:

evaluate

(

summarize

(

'Internet Sales',

'Product Category'[Product Category Name],

'Product Subcategory'[Product Subcategory Name],

rollup('Date'[Calendar Year]),

"Total Sales", format(sum('Internet Sales'[Sales Amount]), "currency"),

"Is Rollup", IsSubtotal('Date'[Calendar Year])

)

order by

'Product Category'[Product Category Name],

'Product Subcategory'[Product Subcategory Name],

'Date'[Calendar Year]

As in the previous examples, we’re rolling up our data based on the CalendarYear column. Consequently, the statement will provide a roll-up row for each category/subcategory grouping, as shown in the following results:

Product Category Name	Product Subcategory Name	Calendar Year	Total Sales	Is Rollup
Accessories	Bike Racks		$39,360.00	True
Accessories	Bike Racks	2007	$16,440.00	False
Accessories	Bike Racks	2008	$22,920.00	False
Accessories	Bike Stands		$39,591.00	True
Accessories	Bike Stands	2007	$18,921.00	False
Accessories	Bike Stands	2008	$20,670.00	False
Accessories	Bottles and Cages		$56,798.19	True
Accessories	Bottles and Cages	2007	$23,280.27	False
Accessories	Bottles and Cages	2008	$33,517.92	False
Accessories	Cleaners		$7,218.60	True
Accessories	Cleaners	2007	$3,044.85	False
Accessories	Cleaners	2008	$4,173.75	False
Accessories	Fenders		$46,619.58	True
Accessories	Fenders	2007	$19,408.34	False
Accessories	Fenders	2008	$27,211.24	False
Accessories	Helmets		$225,335.60	True
Accessories	Helmets	2007	$92,583.54	False
Accessories	Helmets	2008	$132,752.06	False
Accessories	Hydration Packs		$40,307.67	True
Accessories	Hydration Packs	2007	$16,771.95	False
Accessories	Hydration Packs	2008	$23,535.72	False
Accessories	Tires and Tubes		$245,529.32	True
Accessories	Tires and Tubes	2007	$103,259.76	False
Accessories	Tires and Tubes	2008	$142,269.56	False
Bikes	Mountain Bikes		$9,952,759.56	True
Bikes	Mountain Bikes	2005	$585,973.27	False
Bikes	Mountain Bikes	2006	$1,562,456.76	False
Bikes	Mountain Bikes	2007	$3,989,638.48	False
Bikes	Mountain Bikes	2008	$3,814,691.06	False

This level of roll-up is probably fine for some of your needs, but at times, you might find it handy to return a roll-up row for each category, as well as each category/subcategory grouping. The easiest way to do that is to add the ProductSubcategoryName column to the Rollup function, as shown in the following example:

evaluate

(

summarize

(

'Internet Sales',

'Product Category'[Product Category Name],

rollup('Product Subcategory'[Product Subcategory Name], 'Date'[Calendar Year]),

"Total Sales", format(sum('Internet Sales'[Sales Amount]), "currency"),

"Is Rollup", IsSubtotal('Date'[Calendar Year])

)

order by

'Product Category'[Product Category Name],

'Product Subcategory'[Product Subcategory Name],

'Date'[Calendar Year]

When adding an additional column to the Rollup function, you must separate the column names with a comma, but that’s all you’ll need to do. As the following table shows, the results now include a roll-up row for each product category:

Product Category Name	Product Subcategory Name	Calendar Year	Total Sales	Is Rollup
Accessories			$700,759.96	True
Accessories	Bike Racks		$39,360.00	True
Accessories	Bike Racks	2007	$16,440.00	False
Accessories	Bike Racks	2008	$22,920.00	False
Accessories	Bike Stands		$39,591.00	True
Accessories	Bike Stands	2007	$18,921.00	False
Accessories	Bike Stands	2008	$20,670.00	False
Accessories	Bottles and Cages		$56,798.19	True
Accessories	Bottles and Cages	2007	$23,280.27	False
Accessories	Bottles and Cages	2008	$33,517.92	False
Accessories	Cleaners		$7,218.60	True
Accessories	Cleaners	2007	$3,044.85	False
Accessories	Cleaners	2008	$4,173.75	False
Accessories	Fenders		$46,619.58	True
Accessories	Fenders	2007	$19,408.34	False
Accessories	Fenders	2008	$27,211.24	False
Accessories	Helmets		$225,335.60	True
Accessories	Helmets	2007	$92,583.54	False
Accessories	Helmets	2008	$132,752.06	False
Accessories	Hydration Packs		$40,307.67	True
Accessories	Hydration Packs	2007	$16,771.95	False
Accessories	Hydration Packs	2008	$23,535.72	False
Accessories	Tires and Tubes		$245,529.32	True
Accessories	Tires and Tubes	2007	$103,259.76	False
Accessories	Tires and Tubes	2008	$142,269.56	False
Bikes			$28,318,144.65	True
Bikes	Mountain Bikes		$9,952,759.56	True
Bikes	Mountain Bikes	2005	$585,973.27	False
Bikes	Mountain Bikes	2006	$1,562,456.76	False
Bikes	Mountain Bikes	2007	$3,989,638.48	False
Bikes	Mountain Bikes	2008	$3,814,691.06	False

The Summarize function supports one other related function worth mentioning: RollupGroup. When used in conjunction with the Rollup function, the RollupGroup function prevents partial subtotals from being included in the result set. Let’s look at another example to demonstrate how this works. In the following Evaluate statement, I’ve added the RollupGroup function as part of the Rollup function:

evaluate

(

summarize

(

'Internet Sales',

'Product Category'[Product Category Name],

rollup(rollupgroup('Product Subcategory'[Product Subcategory Name], 'Date'[Calendar Year])),

"Total Sales", format(sum('Internet Sales'[Sales Amount]), "currency"),

"Is Rollup", IsSubtotal('Date'[Calendar Year])

)

order by

'Product Category'[Product Category Name],

'Product Subcategory'[Product Subcategory Name],

'Date'[Calendar Year]

The column names are passed into the RollupGroup function, and that function is passed to the Rollup function. As to be expected, the statement now returns slightly different results:

Product Category Name	Product Subcategory Name	Calendar Year	Total Sales	Is Rollup
Accessories			$700,759.96	True
Accessories	Bike Racks	2007	$16,440.00	False
Accessories	Bike Racks	2008	$22,920.00	False
Accessories	Bike Stands	2007	$18,921.00	False
Accessories	Bike Stands	2008	$20,670.00	False
Accessories	Bottles and Cages	2007	$23,280.27	False
Accessories	Bottles and Cages	2008	$33,517.92	False
Accessories	Cleaners	2007	$3,044.85	False
Accessories	Cleaners	2008	$4,173.75	False
Accessories	Fenders	2007	$19,408.34	False
Accessories	Fenders	2008	$27,211.24	False
Accessories	Helmets	2007	$92,583.54	False
Accessories	Helmets	2008	$132,752.06	False
Accessories	Hydration Packs	2007	$16,771.95	False
Accessories	Hydration Packs	2008	$23,535.72	False
Accessories	Tires and Tubes	2007	$103,259.76	False
Accessories	Tires and Tubes	2008	$142,269.56	False
Bikes			$28,318,144.65	True
Bikes	Mountain Bikes	2005	$585,973.27	False
Bikes	Mountain Bikes	2006	$1,562,456.76	False
Bikes	Mountain Bikes	2007	$3,989,638.48	False
Bikes	Mountain Bikes	2008	$3,814,691.06	False

A roll-up row is included for each category, but not for the subcategories. As you can see, DAX provides a number of ways to roll up your data so your results include exactly the information you need.

The Crossjoin Function

The last function that we’ll look at is Crossjoin. You use a Crossjoin function to join two or more tables in order to return a Cartesian product, a result set in which each row in each table is joined with each row in each other table. The result set is the product of the number of rows from each table. For example, if one table contains five rows and a second table contains seven rows, the Cartesian product for those two tables will contain 35 rows.

This will all become clearer as we work through our examples. But first, take a look at the following Evaluate statement, in which we retrieve data from the Internet Sales table:

evaluate

(

summarize

(

'Internet Sales',

'Product Category'[Product Category Name],

'Date'[Calendar Year],

"Total Sales", format(sum('Internet Sales'[Sales Amount]), "currency")

)

order by

'Product Category'[Product Category Name],

'Date'[Calendar Year]

The statement contains nothing that you haven’t seen before. It groups data based on the ProductCategory and CalendarYear columns and provides the sales totals for each group, as shown in the following results:

Product Category Name	Calendar Year	Total Sales
Accessories	2007	$293,709.71
Accessories	2008	$407,050.25
Bikes	2005	$3,266,373.66
Bikes	2006	$6,530,343.53
Bikes	2007	$9,359,102.62
Bikes	2008	$9,162,324.85
Clothing	2007	$138,247.97
Clothing	2008	$201,524.64

The Evaluate statement returns just what we’d expect. However, what’s important about these results are not what they include, but what is missing. For example, the Product Category column in the source table includes the value Components, but that value isn’t included in the results because no sales are associated with that category. In addition, years are displayed only if they have sales associated with them. For instance, the results include information about the Accessories category for 2007 and 2008, but no other years are listed.

Often, we’ll want our results to be more complete than what we’re able to generate with a simple Summarize function and for that we need the Crossjoin function. It returns a Cartesian product, and we can use that data to generate more complete results.

Let’s look at an example to give you an idea of how the Crossjoin function works. In the following Evaluate statement, I use the Crossjoin function as the statement’s table expression:

evaluate

(

crossjoin

(

values('Product Category'[Product Category Name]),

values('Date'[Calendar Year])

)

order by

'Product Category'[Product Category Name],

'Date'[Calendar Year]

I pass two arguments into the Crossjoin function. Each argument must return a table. In this case, I use the Values function for each argument in order to return a one-column table of distinct values. The first Values function returns a list of values from the Product Category Name column, and the second Values function returns a list of values from the Calendar Year column. (These are the same two columns used to group data in the previous example.) The Evaluate statement returns the results shown in the following table:

Product Category Name	Calendar Year
	2005
	2006
	2007
	2008
	2009
	2010
Accessories	2005
Accessories	2006
Accessories	2007
Accessories	2008
Accessories	2009
Accessories	2010
Bikes	2005
Bikes	2006
Bikes	2007
Bikes	2008
Bikes	2009
Bikes	2010
Clothing	2005
Clothing	2006
Clothing	2007
Clothing	2008
Clothing	2009
Clothing	2010
Components	2005
Components	2006
Components	2007
Components	2008
Components	2009
Components	2010

Each row in each table has been joined with each row in the other table. For example, the Accessories value in the Product Category Name column is matched with each year in the Calendar Year column. The reason that there are blank values in the Product Category Name column is because the source data-the Product Category table in the tabular database-includes a blank member, and that blank member is matched with each year, just like the other Product Category Name values. Because the Product Category column contains five distinct values, and the Calendar Year column contains six distinct values, our query returns 30 rows.

Now that we know how to use the Crossjoin function to produce the Cartesian product, we can include it in our Evaluate statement to return more meaningful data. The following example uses the Crossjoin function as the table expression within the Summarize function:

evaluate

(

summarize

(

crossjoin

(

values('Product Category'[Product Category Name]),

values('Date'[Calendar Year])

'Product Category'[Product Category Name],

'Date'[Calendar Year],

"Total Sales", format(sum('Internet Sales'[Sales Amount]), "currency")

)

order by

'Product Category'[Product Category Name],

'Date'[Calendar Year]

The Summarize function is now working with a Cartesian product as its table. We can then group the data based on the ProductCategoryName column and the CalendarYear column, both of which are including in the Cartesian product. Now the Evaluate statement returns the following results:

Product Category Name	Calendar Year	Total Sales
	2005
	2006
	2007
	2008
	2009
	2010
Accessories	2005
Accessories	2006
Accessories	2007	$293,709.71
Accessories	2008	$407,050.25
Accessories	2009
Accessories	2010
Bikes	2005	$3,266,373.66
Bikes	2006	$6,530,343.53
Bikes	2007	$9,359,102.62
Bikes	2008	$9,162,324.85
Bikes	2009
Bikes	2010
Clothing	2005
Clothing	2006
Clothing	2007	$138,247.97
Clothing	2008	$201,524.64
Clothing	2009
Clothing	2010
Components	2005
Components	2006
Components	2007
Components	2008
Components	2009
Components	2010

As you can see, all the data is included in the results, whether or not any sales were made for a particular category in a particular year. This way, you know you’re returning the complete results, which can often give you a better picture of the data-and point to where potential problems might exist.

DAX Statistical Functions

You should now have a good sense of how to use the Row, Topn, Rollup, and Crossjoin functions, along with several of the aggregate statistical functions. Once you know the basics of how to use them, you should be able to apply them to other situations. DAX also supports a number of other statistical functions, as well as additional functions types, such as date and time functions, filter functions, and logical functions. Be sure to check out the TechNet article “DAX Function Reference” for information about those other functions as well as additional details about the functions we’ve covered.

Register for Simple Talk

DAX Statistical Functions

The Row Function

The Topn Function

The Rollup Function

The Crossjoin Function

DAX Statistical Functions

Article tags

About the author

Robert Sheldon

Robert's contributions

Articles

Books

Top topics

Robert's latest contributions:

Getting started with Azure Database for PostgreSQL Flexible Server

Moving from SQL to MongoDB’s MQL

Working with the MongoDB Database Tools

The Row Function

The Topn Function

The Rollup Function

The Crossjoin Function

DAX Statistical Functions

Article tags

Recommended

About the author

Robert Sheldon

Robert's contributions

Articles

Books

Top topics

Robert's latest contributions:

Getting started with Azure Database for PostgreSQL Flexible Server

Moving from SQL to MongoDB’s MQL

Working with the MongoDB Database Tools