SQL Server BIT_COUNT and an Alternative

Comments 0

Share to social media

I was editing an article the other day that uses the BIT_COUNT function that was added to SQL Server 2022. The solution presented is excellent, and I will try to come back and link to it here when I remember. (It will be linked the other way.

Reading that did two things for me. First it gave me an idea of how the BIT_COUNT function might be actually be used in a useful way. The solution that was presented would only work in SQL Server 2022 (It will work in earlier compatibility levels, based on the tests I have done.)

Bitwise operations, really?

As a person who is BIG into relational design and getting a well normalized database, let’s start here. There is rarely a reason to use the bits of an integer value for any reason. I have argued with people plenty about this sort of thing, like when someone wanted to create a multi-valued domain table like the following example.

Note that this isn’t an optimal design, so I am not advocating this for any new build. But it does happen, including places like SQL Server’s system tables.

Now you just load up the AttributeType table with values that have primary key of the values that use a single bit in an integer:

Then, you have a multi-valued relationship with one less table. You can’t join to it conveniently, but it does work. For example:

Now, you have a Person row, and you can join to the attributes in this way:

Which returns:

And you can see that this person has two attributes with BIT_COUNT:

This returns:

Now, of course I am not advocating this solution at all. There have been a few articles here on Simple Talk that made nice use of bits, for example Aaron Bertrand’s Days of the Week article. But as a rule, this is not really a good design pattern, mostly for the reason that the join I created is very cryptic for 90% of people (and I didn’t exactly pull it out of active memory myself!).

But what I did want to do is fix is if you are using a pre-SQL Server 2022 version of SQL Server and want to count the bits in an integer value, you can still use the technique from the forthcoming article I mentioned at the start of this article.

The way this is done is basically shown in the join criteria. You and every single bit binary value, and when it is equal to the value you anded, you add one to the output. While the values you use are only the ones with a single bit, but it can also be used to find items with multiple bits. Like if you wanted to find the Person rows where they only have Attributes 1 and 4, you would just look for rows where Attributes = 9. But if you want to look for rows with at least those two:

And even if you updated it to 11:

The SELECT statement will still return ‘John’, and since you will still have the 1 and 8 bits set and now the SELECT with the JOIN from earlier in the chapter will return all three values:

And BIT_COUNT will return 3.

The real point here is

Before I go too far and make this an article that looks like it is advocating for multi-valued, poorly normalized, bit based relationships… let’s get to the point. Counting bits.

If BIT_COUNT isn’t available, you can use something like this:

Which will return 3, just like BIT_COUNT will. And you can extend this pretty easily to cover all integers by doing all POWER(2) values that will fit in an integer. This is tedious to type, so I will do this with a bit of code.

Note: I am using the GENERATE_SERIES function that also was added to SQL Server 2022, but you can generate a numbers table using code like this one from Greg Larsen a few years ago if you are on an earlier version.

Now, copy that code and make the following function. The part after the --copied code is what i generated, including all the bits available.

I added an extra line to handle negative values. I would be wary of using that special bit using any method that is giving meaning to the bits, but it is available as a 32nd bit and counted by the BIT_COUNT function:

Now, you should be able to use this for most of your bit counting uses just like BIT_COUNT.

You could extend it for any type of uses but I will submit to you that if you find yourself doing bitwise operations for any reason other than legacy data/designs/or increasing processing performance with several orders of magnitude performance gains (like eliminating necessary loops in code that are executed many times over), you should rethink your plans.

Testing your solution

Like any code experiment, one last thing to do, make sure you have tested the code. This code can be tested like this using a SQL Server 2022 instance, though I wouldn’t try to run this on your laptop if you are on battery.

It is doing over 4 billion iterations of those functions and on my 4 core laptop with 11th Gen Intel Core i7-1185G7 @ 3.00GHz CPU (that is also running Discord, Slack, and especially Word as I edit this to a postable state while I wait for it to finish), it took “going out to lunch” time to complete.

The CPU graph was fun to watch while this was running. It is not doing any parallel operations. I will show you the plan for a more performance oriented test in a bit.

For me, this returned: 5286682 milliseconds which is and, as SSMS shows me, over 128 minutes and 7 seconds. I mean, I did ask it to do that function and the BIT_COUNT function over 4 billion times each, so I am pretty happy with that.

Unlike most code, because I have tested every single possible value that can be entered, this code is actually verified correct for any input that is in the allowable range. Most functions cannot be tested this well, and to be fair, you wouldn’t want to use my test as your unit test either, since it takes so long.

When this returns 0 rows, you know it works like it should.

Which version is faster?

One of the only reasons (other than being stuck in a legacy version) you typically say “use your own function instead of a system function” is performance. The FORMAT function is one of those cases that Aaron Bertrand covered a few years ago on this SQL Performance post.

So while it is established that this function works correctly, and 100% better than not having one built in in previous versions, in this section, I want to run a quick test to see which one performs better or worse.

To do this, I will run the functions a few times on a fairly large set of data. Whichever one was the fastest, I ran it before and after the slower one, to make sure there was nothing weird about the test based on ordering. (I also altered the order a few times before deciding to accept my results.

On my machine, these queries executed in considerably different times, and the system function blew it away by an order of magnitude.

Mind you, these are milliseconds for calculating the value a million times. And 1.4 seconds versus .4 seconds would be a win for the system function, even if it was the slower of the two! But it definitely is faster using the system function.

In Conclusion

I started this experiment because of the usage of BIT_COUNT in an article that I also felt would be useful in previous versions of SQL Server and to be honest, today is the first Saturday of the year with no football, so I had some time on my hands.

I think the function: BitCount_Int would be a good enough replacement for BIT_COUNT in your earlier systems that should work in reasonable version of SQL Server.

And remember, flipping and counting bits should be reserved for only the most time sensitive, performance tuning operations. It makes code hard to understand, hard to use, and so many more negatives. But if you find a solution that executes 100- or 1000-times shorter execution time than not using it… I can’t fight you on that, though I still will try.

Article tags

Load comments

About the author

Louis Davidson

Simple Talk Editor

See Profile

Louis is the editor of this Simple-Talk website. Prior to that, has was a corporate database developer and data architect for a non-profit organization for 25 years! Louis has been a Microsoft MVP since 2004, and is the author of a series of SQL Server Database Design books, most recently Pro SQL Server Relational Database Design and Implementation.