{"id":84184,"date":"2019-05-09T17:46:49","date_gmt":"2019-05-09T17:46:49","guid":{"rendered":"https:\/\/www.red-gate.com\/simple-talk\/?p=84184"},"modified":"2021-04-26T16:50:48","modified_gmt":"2021-04-26T16:50:48","slug":"the-phantom-menace-in-unit-testing","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/devops\/testing\/the-phantom-menace-in-unit-testing\/","title":{"rendered":"The Phantom Menace in Unit Testing"},"content":{"rendered":"<p>Let me state up front that this is <em>not<\/em> a rant about unit testing; unit tests are critically important elements of a robust and healthy software implementation. Instead, it is a cautionary tale about a small class of unit tests that may deceive you by seeming to provide test coverage but failing to do so. I call this class of unit tests <strong>phantom tests<\/strong> because they return what are, in fact, correct results but not necessarily because the system-under-test (SUT) is doing the right thing or, indeed, doing <em>anything<\/em>.<\/p>\n<p>In these cases, the SUT \u201cnaturally\u201d returns the expected value, so doing (a) the correct thing, (b) something unrelated, or even (c) nothing, would still yield a passing test. If the SUT is doing (b) or (c), then it follows that the test is adding no value. Moreover, I submit that the presence of such tests is often deleterious, making you worse off than not having them because you think you have coverage when you do not. When you then go to make a change to the SUT supposedly covered by that test, and the test still passes, you might blissfully conclude that your change did not introduce any bugs to the code, so you go on your merry way to your next task. In actuality, you simply do not know if you introduced any bugs because your test (or tests) are not reporting valid information.<\/p>\n<h2>Invoking Some Spirits<\/h2>\n<p>What exactly is a phantom test? Consider this example. 
Say you have a function AccumulateWhenGreen(value, condition). The parameters are:<\/p>\n<ul>\n<li><strong>Value<\/strong> &#8212; the number to add to the accumulation<\/li>\n<li><strong>Condition<\/strong> &#8212; red, yellow, or green indicating some status<\/li>\n<\/ul>\n<p>The name implies that it should accumulate the given value when the condition is green and skip the value when the condition is red or yellow. To evaluate the function, write a unit test (this is in pseudo-code rather than any particular language).<\/p>\n<pre class=\"lang:c# theme:vs2012\">Test \u201cAccumulateWhenGreen skips value when red\u201d\r\n{\r\n    Accumulator &lt;- 0\r\n    AccumulateWhenGreen(23, Condition.Red)\r\n    Assert Accumulator = 0\r\n}<\/pre>\n<p>If that test passes, the function has successfully fulfilled that clause in the contract, right? (By <em>contract<\/em> I mean the software requirements to be implemented.) Not so fast. Look at the pseudo-code for AccumulateWhenGreen itself:<\/p>\n<pre class=\"lang:c# theme:vs2012\">Function AccumulateWhenGreen(Value::int, Condition:: ConditionType)\r\n{\r\n    If (Condition is ConditionType.Green)\r\n    Then Accumulator &lt;- Accumulator + Value\r\n    Else DoNothing\r\n}<\/pre>\n<p>That code is correctly written to implement the relevant requirement. However, do not take my word for it; prove it. Change the test so that the second argument passed to AccumulateWhenGreen is Condition.Green instead of Condition.Red. What happens to the test? Now the test fails, because the value gets added to the accumulator and thus the accumulator is non-zero. Finally, change the parameter to Condition.Yellow and the test again passes. Q.E.D.<\/p>\n<p>So far so good. 
Now consider this alternate implementation of AccumulateWhenGreen:<\/p>\n<pre class=\"lang:c# theme:vs2012\">Function AccumulateWhenGreen(Value::int, Condition:: ConditionType)\r\n{\r\n    Accumulator &lt;- Accumulator + Value - (((Value * 3) + 6) \/ 3) + 2\r\n}<\/pre>\n<p>At first glance, it looks like this does some convoluted computation to make that code do what it is supposed to. Moreover, the \u201cAccumulateWhenGreen skips value when red\u201d unit test will pass: the accumulator will not be changed. Is it because that computation somehow takes the input value \u201c23\u201d into account? No; the expression reduces to Accumulator + Value - (Value + 2) + 2, which is simply Accumulator, so the unit test will pass for <em>any<\/em> integer you care to provide to the function. That\u2019s good, right? Why is the code working? The answer is that it is not. Sure, the test passes for Condition.Red. It also passes for Condition.Yellow. Fine. However, for Condition.Green, <em>the test still passes<\/em> when it should fail, because the accumulator is supposed to change for Condition.Green.<\/p>\n<p>At the very beginning I mentioned three things the code could do:<\/p>\n<p>(a) the correct thing<\/p>\n<p>(b) something unrelated<\/p>\n<p>(c) nothing<\/p>\n<p>In this case, the code is doing something unrelated. Notice that the Condition argument is suspiciously absent from the calculation. Paraphrasing a professor from my university days, the code is providing an answer to <em>some<\/em> question, just not the correct one! Consider the third alternative, doing nothing. With this code\u2026<\/p>\n<pre class=\"lang:c# theme:vs2012\">Function AccumulateWhenGreen(Value::int, Condition:: ConditionType)\r\n{\r\n    DoNothing\r\n}<\/pre>\n<p>\u2026you get the same results: the unit test passes for Condition.Red and for Condition.Yellow \u2013 both of which are good news \u2013 and for Condition.Green, which is bad news.<\/p>\n<p>How to solve this? 
Recall above that, with the correct code in place, the test passed for Condition.Red or for Condition.Yellow but failed with Condition.Green. One way to avoid the phantom menace is to add more tests:<\/p>\n<pre class=\"lang:c# theme:vs2012\">Test \u201cAccumulateWhenGreen skips value when red\u201d\r\n{\r\n    Accumulator &lt;- 0\r\n    AccumulateWhenGreen(23, Condition.Red)\r\n    Assert Accumulator = 0\r\n}\r\n\r\nTest \u201cAccumulateWhenGreen skips value when yellow\u201d\r\n{\r\n    Accumulator &lt;- 0\r\n    AccumulateWhenGreen(23, Condition.Yellow)\r\n    Assert Accumulator = 0\r\n}\r\n\r\nTest \u201cAccumulateWhenGreen adds value when green\u201d\r\n{\r\n    Accumulator &lt;- 0\r\n    AccumulateWhenGreen(23, Condition.Green)\r\n    Assert Accumulator = 23\r\n}<\/pre>\n<p>With this suite of tests in place, the correct code&#8211;case (a)&#8211;passes all three tests, but the incorrect code&#8211;cases (b) or (c)&#8211;fails on the third test.<\/p>\n<div class=\"background-color--grey--1\">\n<p style=\"text-align: center;\"><strong>Maxim #1:<\/strong><\/p>\n<p class=\"color--blue--6\" style=\"text-align: center;\"><em>A test checking that <strong>nothing<\/strong> happened<\/em><\/p>\n<p class=\"color--blue--6\" style=\"text-align: center;\"><em>must be accompanied<\/em><\/p>\n<p class=\"color--blue--6\" style=\"text-align: center;\"><em>by a test checking that <strong>something<\/strong> happened<\/em><\/p>\n<p>&nbsp;<\/p>\n<\/div>\n<p>Notice that the tests for Condition.Red and for Condition.Yellow passed with the correct code, unrelated code, or no code under test. That is, <em>they always passed<\/em>. Do they actually serve a purpose then? Yes! At this moment, those tests always pass, so you may conclude that if they pass, they provide no useful information about the correctness of the SUT. 
However, if down the road, upon making changes to the SUT they start failing, then those changes did introduce some problem.<\/p>\n<div class=\"background-color--grey--1\">\n<p style=\"text-align: center;\"><strong>Maxim #2:<\/strong><\/p>\n<p class=\"color--blue--6\" style=\"text-align: center;\"><em>A phantom test proves nothing if it passes.<\/em><\/p>\n<p class=\"color--blue--6\" style=\"text-align: center;\"><em>It indicates a real problem if it fails.<\/em><\/p>\n<p>&nbsp;<\/p>\n<\/div>\n<p>Can you make a phantom test more solid (pun intended)? That is, can you make a test that is supposed to confirm nothing happened mean that nothing happened <em>correctly<\/em>? (Again, that means (a) the SUT has correct code rather than (b) unrelated code or (c) no code.)<\/p>\n<p>Yes! Bring on our guest practice for this segment &#8212; <em>test-driven development<\/em> (TDD). Whether or not you use TDD, whether or not you write your tests first or last, the following can help you do (apologies to non-native English speakers for being cute here!) well, nothing. More formally, the following can help you create purportedly phantom tests\u2014tests that confirm that nothing happened\u2014in a way that you can have confidence that the code did that no-op in a correct manner.<\/p>\n<p>TDD principles state that when you want to add new functionality, you first add a new test and that the <em>new test must fail<\/em>. 
If it does not fail, then either you have added a test for something your system already does, and presumably already have a test for, or you have added a phantom test, and the test will always pass.<\/p>\n<p>Here is one way the story might have unfolded with our sample code and tests:<\/p>\n<p>Create the first, happy path test:<\/p>\n<pre class=\"lang:c# theme:vs2012\">Test \u201cAccumulateWhenGreen adds value when green\u201d\r\n{\r\n    Accumulator &lt;- 0\r\n    AccumulateWhenGreen(23, Condition.Green)\r\n    Assert Accumulator = 23\r\n}<\/pre>\n<p>Write some code to make it pass:<\/p>\n<pre class=\"lang:c# theme:vs2012\">Function AccumulateWhenGreen(Value::int, Condition:: ConditionType)\r\n{\r\n    Accumulator &lt;- Accumulator + Value\r\n}<\/pre>\n<p>Add the next two tests together:<\/p>\n<pre class=\"lang:c# theme:vs2012\">Test \u201cAccumulateWhenGreen skips value when red\u201d\r\n{\r\n    Accumulator &lt;- 0\r\n    AccumulateWhenGreen(23, Condition.Red)\r\n    Assert Accumulator = 0\r\n}\r\n\r\nTest \u201cAccumulateWhenGreen skips value when yellow\u201d\r\n{\r\n    Accumulator &lt;- 0\r\n    AccumulateWhenGreen(23, Condition.Yellow)\r\n    Assert Accumulator = 0\r\n}<\/pre>\n<p>Both of those tests will fail. By Maxim #2, that says there is a problem to fix, as there should be. Write some more code that makes those tests now pass, adding the conditional in this case:<\/p>\n<pre class=\"lang:c# theme:vs2012 \">Function AccumulateWhenGreen(Value::int, Condition:: ConditionType)\r\n{\r\n    If (Condition is ConditionType.Green)\r\n    Then Accumulator &lt;- Accumulator + Value\r\n    Else DoNothing\r\n}<\/pre>\n<p>With that in place, all tests pass. Moreover, those two new tests are now purportedly phantom tests. 
However, they began as failing tests and turned into passing tests as the code evolved, so they have provided value.<\/p>\n<div class=\"background-color--grey--1\">\n<p style=\"text-align: center;\"><strong>Maxim #3:<\/strong><\/p>\n<p class=\"color--blue--6\" style=\"text-align: center;\"><em>How you arrive at a phantom test matters.<\/em><\/p>\n<p>&nbsp;<\/p>\n<\/div>\n<p>I illustrated the above with TDD because it is vital that a new test fails when you first introduce it. You can sometimes meet this requirement in a non-TDD fashion, but it takes more work. Either you need to add some more logic to your test to get to a state where it will fail, or you need to break your working SUT so that the test fails. Once you confirm that the test is failing for the right reason, make it pass by backing out those artificial tweaks.<\/p>\n<h2>A Real-World Example: The Authorisation Problem<\/h2>\n<p>Sometimes you have to live with the presence of phantom tests, but often you can convert them to real, non-phantom tests. To illustrate this point, turn from the above academic example to consider a real-world example. Say you are designing an <a href=\"https:\/\/en.wikipedia.org\/wiki\/Authorization\">authorisation system<\/a> to regulate access rights to resources in your enterprise system. One typical foundation of such an authorisation system might succinctly be:<\/p>\n<p><em>User U is authorised to perform an action A if there is a policy that allows U to perform A <\/em><strong><em>and<\/em><\/strong><em> there is no policy that denies U from performing A. 
<\/em><\/p>\n<p>Here are the tests you might come up with.<\/p>\n<p><strong>T<sub>0<\/sub> \u2013 With no policies, an action is denied<\/strong><\/p>\n<p>That certainly is part of the contract because, per the stated requirement, there is no policy allowing the action, so the result should be denied.<\/p>\n<p><strong>T<sub>1<\/sub> \u2013 With a policy allowing an action, the action is allowed<\/strong><\/p>\n<p>Clearly, from the requirement, the presence of such a policy should result in the action being allowed. Designate this policy P<sub>1<\/sub>, as you will use it again shortly.<\/p>\n<p>Next comes the test to confirm that when there is a policy that denies U from performing A, the authorisation decision is, in fact, \u201cdenied\u201d.<\/p>\n<p><strong>T<sub>2<\/sub> \u2013 With a policy denying an action, the action is denied<\/strong><\/p>\n<p>Here create a single policy P<sub>2<\/sub> that denies U from performing A and check the result. The result should be \u201cdenied,\u201d but what does that tell us? If you remove P<sub>2<\/sub>, so that you now have <em>no policies at all<\/em>, and then check the result, it will <em>still<\/em> come back with \u201cdenied\u201d. Why? Because there was no policy <em>allowing<\/em> A. This test is a classic phantom test: the requirement is to test that the presence of P<sub>2<\/sub> <em>caused the outcome to be denied<\/em>. Yet removing P<sub>2<\/sub> yielded the same result, so the fact that the test passes does <strong>not<\/strong> prove anything.<\/p>\n<p>You could turn this phantom test into a solid test, though, by bringing in policy P<sub>1<\/sub> that was created earlier. It allows U to perform A. Thus, instead of using just P<sub>2<\/sub> in this test, use P<sub>1<\/sub> + P<sub>2<\/sub>. If the result is \u201cdenied,\u201d it is due to the presence of P<sub>2<\/sub>. Can you prove that? Certainly. 
If you remove P<sub>2<\/sub>, the result will be \u201callowed\u201d because there exists a policy (P<sub>1<\/sub>) that allows U to perform A. Therefore, the test will fail. This test, with P<sub>1<\/sub> + P<sub>2<\/sub>, is now a solid test!<\/p>\n<div class=\"background-color--grey--1\">\n<p style=\"text-align: center;\"><strong>Maxim #4:<\/strong><\/p>\n<p class=\"color--blue--6\" style=\"text-align: center;\"><em>Whenever possible, convert phantom tests to real tests.<\/em><\/p>\n<p>&nbsp;<\/p>\n<\/div>\n<h2>Conclusion<\/h2>\n<p>Phantom tests are sneaky. They can be hard to spot, and they provide a false sense of security. You have taken the first step to combat phantom tests just by being aware of their existence. When you do uncover a phantom test, look for ways to turn it into a solid test, so that it does not always pass. You should be able to make the test fail and then, by adding in the key thing you want, make it pass. If, however, you must have a test that checks for nothing happening, make sure it is at least accompanied by a test that checks for something happening, too.<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Testing is a challenging yet crucial part of software development, but how do you know that a test is telling you what you need to know? 
In this article, Michael Sorens explores the concept of phantom tests that return correct results but don\u2019t actually prove anything.&hellip;<\/p>\n","protected":false},"author":221868,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[143519],"tags":[95509],"coauthors":[6802],"class_list":["post-84184","post","type-post","status-publish","format-standard","hentry","category-testing","tag-standardize"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/84184","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/221868"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=84184"}],"version-history":[{"count":19,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/84184\/revisions"}],"predecessor-version":[{"id":90715,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/84184\/revisions\/90715"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=84184"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=84184"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=84184"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=84184"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}