Crowdsourcing quality: Can the hive mind/herd mentality help you in software testing?

Crowdsourcing is one way to get a job done. In this article, Devyani Borade describes crowdsourcing quality testing of software.

When I left my employer, 856 people took my place. My job was being done by a village, and not just any village; it was a global village. My role as a software tester was outsourced to a crowd: crowdsourced.

From the time kings invited their populace to try their hand at solving the day’s domestic disturbances, to when the Oxford English Dictionary was compiled from references mailed in by readers, to SETI@home harnessing the power of idle personal computers, crowdsourcing has always been present: sometimes hovering in the background, other times appearing in plain sight. Simply put, crowdsourcing means recruiting masses of people of varying skills and experience to perform a certain set of tasks in parallel in a short time.

However, while crowdsourcing seems to work well in situations where user-generated content is useful to build up knowledge, provide first-hand feedback, or raise funds for a good cause, would it also work when it came to improving the quality of software? I set out to find out more.

The benefits of crowdsourcing software testing

Crowdsourced testing involves a crowdsourcing company recruiting amateur and professional testers, assigning them projects to work on, and then paying them (while charging its own customer) according to different pricing models. By far the most popular is pay-per-bug: when a tester reports a bug that the customer agrees to fix, the customer pays the company an agreed amount of money, a small part of which filters down to the tester who found the bug. This amount often depends on the severity and criticality of the bug. With advancements in web communication technology, crowdsourced testing appears to be becoming an increasingly attractive option in some cases.
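
To make the pay-per-bug model concrete, here is a minimal sketch of how a settlement might be computed. The severity tiers, rates, and tester’s share are invented for illustration; real platforms price differently.

```python
# Illustrative pay-per-bug payout model. Every number here is an
# assumption made up for the example, not any real platform's pricing.
SEVERITY_RATES = {        # what the customer pays per accepted bug
    "critical": 200.0,
    "major": 80.0,
    "minor": 25.0,
}
TESTER_SHARE = 0.4        # the fraction that filters down to the tester

def settle(accepted_bug_severities):
    """Settle a batch of bugs the customer has formally accepted."""
    customer_cost = sum(SEVERITY_RATES[s] for s in accepted_bug_severities)
    tester_earnings = customer_cost * TESTER_SHARE
    platform_margin = customer_cost - tester_earnings
    return customer_cost, tester_earnings, platform_margin

cost, earned, margin = settle(["critical", "major", "minor", "minor"])
print(f"customer pays {cost:.2f}; testers earn {earned:.2f}; "
      f"platform keeps {margin:.2f}")
# customer pays 330.00; testers earn 132.00; platform keeps 198.00
```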

Some of the benefits of crowdsourcing testing are:

  • Results drive costs: no results, no money.
  • The crowdsourcing company benefits from on-demand scalability and availability of resources within a short period of time.
  • The customer benefits from a wide variety of skills and talent that may not be available to exploit in-house.
  • Shared product ownership results in brand loyalty and community-building for the greater good.
  • Crowdsourcing is particularly suited to mobile app testing, where thousands of OS-Phone-Browser combinations are possible, making it unfeasible for a single organisation to own so many different devices for testing, while simulation is unreliable and risky (a back-of-the-envelope calculation follows this list).
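
As promised above, a back-of-the-envelope calculation shows how quickly that matrix grows. The counts below are plausible round numbers, not real market data:

```python
# Rough size of a mobile test matrix. All counts are illustrative
# round numbers, not a survey of the actual device market.
os_versions = 12     # recent Android and iOS releases still in use
phone_models = 60    # a modest slice of handsets in circulation
browsers = 5         # default browser plus popular third-party ones

combinations = os_versions * phone_models * browsers
print(f"{combinations} OS-Phone-Browser combinations to cover")  # 3600
```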

Questions to ask before crowdsourcing software testing

However, a number of questions need to be considered carefully before accepting crowdsourcing as the next silver bullet of software testing.

Confidentiality: Company projects are confidential, yet anyone can sign up to test a project and gain access to the project information, and not much is known or verified about their background, qualifications, or references. There are no formal vetting checks carried out to ensure that genuine testers are signing on to be part of the project. Sure, you could put hefty NDAs in place, but that knowledge is of little comfort once something important leaks and the damage is done. How is confidentiality secured? Does the customer select the testers themselves, or do they have to make do with whoever responds and shows interest in the project? Once testers are assigned, are long-term off-platform relationships with them (cutting out the middleman) possible?

Business requirements: Crowdsourcing works when the functionality is standard and well known, like an eCommerce sales flow. But where projects are highly customised and require a thorough knowledge of the customer’s account history, what changes were made and why, and what the customer’s requirements actually are, how will handing the work to random, unknown testers fare?

Budget: For a smaller company averaging, say, around 100 bugs per project, standard pay-per-bug rates can make crowdsourcing very expensive indeed. Could there be scope for cheating, in that the customer could review all the bugs, decide that they wanted to fix them all, but formally accept only a few to keep the costs down? In fact, to keep costs down, the customer would ideally want their programmers to pass the code on for crowdtesting only after they’re assured of a reasonable level of quality. Does this not imply that a certain amount of internal testing needs to happen beforehand anyway – either by programmers testing each other’s code, or by an in-house testing team?
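
A quick worked example, assuming a hypothetical blended rate of £50 per accepted bug:

```python
# Back-of-the-envelope project cost under pay-per-bug. The £50
# blended rate per accepted bug is a made-up figure for illustration.
bugs_per_project = 100
blended_rate = 50  # pounds per accepted bug (hypothetical)

print(f"£{bugs_per_project * blended_rate} per project")  # £5000
```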

Time: There are several time-related issues to consider:

  • Overhead for the person(s) who go through the bug reports, which often contain duplicate bugs that need to be filtered out (a simple first-pass de-duplication is sketched after this list).
  • Overhead in advertising for and then selecting the testers desired for each project.
  • Overhead in creating and providing adequate project documentation for each project – this is especially burdensome and risky for companies that are more agile in their development practices and don’t maintain vast amounts of comprehensive documentation.
  • Delays in communication between programmers and testers if each needs to go through an intermediary. (This also implies loss of traceability if the bugs are logged anonymously, and the programmers don’t know whom to ask for details.)
  • Delays in testing feedback turn-around if bug reports need to be reviewed in-house first by the crowdsourcing company before being released to the customer.
  • Delays in dispute resolution, depending on how many staff the crowdsourcing company has available to support this.
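
As promised above, a crude first pass at filtering duplicates can be automated, though real triage still needs a human eye. This sketch matches reports purely on a normalised title, which is an oversimplification of what a triager actually does:

```python
import re

def normalise(title):
    """Crude canonical form: lower-case, strip punctuation, squash spaces."""
    return re.sub(r"\s+", " ", re.sub(r"[^a-z0-9 ]", "", title.lower())).strip()

def filter_duplicates(reports):
    """Keep the first report per normalised title; drop later look-alikes."""
    seen, unique = set(), []
    for report in reports:
        key = normalise(report["title"])
        if key not in seen:
            seen.add(key)
            unique.append(report)
    return unique

reports = [
    {"title": "Checkout crashes on empty basket!"},
    {"title": "checkout CRASHES on empty basket"},
    {"title": "Login page mislabels the password field"},
]
print(len(filter_duplicates(reports)))  # 2 - one duplicate filtered out
```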

Bug quality: Bug reports written by testers of varying professionalism may not be accurately worded or meaningful. They may not contain enough information about the bug and its environment – something that comes to experienced professional testers almost as second nature. Worse, some reports may be misleading and waste everyone’s time.
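
One mitigation is to force some structure onto submissions and reject reports that lack it. A minimal sketch (the field names are my own invention, not any platform’s schema):

```python
from dataclasses import dataclass, field

@dataclass
class BugReport:
    # The information an experienced tester supplies as second nature.
    title: str
    steps_to_reproduce: list[str]
    expected_result: str
    actual_result: str
    environment: str                # OS, device, browser, build number
    severity: str                   # e.g. "critical", "major", "minor"
    attachments: list[str] = field(default_factory=list)  # logs, screenshots

def is_actionable(report: BugReport) -> bool:
    """Reject reports lacking what a programmer needs to reproduce the bug."""
    return all([report.title, report.steps_to_reproduce,
                report.expected_result, report.actual_result,
                report.environment, report.severity])
```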

Requirements coverage and reporting: What is the guarantee that most testers working on the project won’t focus on the bugs that pay more, and thus end up ignoring or leaving unfinished other testing aspects of the project? How does the customer make sure that the entire project scope has been covered fully without having an in-house test manager? Is there provision for tracing bugs back to requirements and for drawing progress metrics out of the system?
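
Traceability need not be elaborate. A minimal sketch, assuming each test session is tagged with the requirement IDs it touched (the IDs and the tagging scheme are assumptions for illustration):

```python
# Minimal requirements-coverage metric. The requirement IDs and the
# idea of tagging sessions with them are assumptions for illustration.
requirements = {"REQ-1", "REQ-2", "REQ-3", "REQ-4"}

sessions = [  # which requirements each completed test session touched
    {"tester": "A", "covered": {"REQ-1", "REQ-2"}},
    {"tester": "B", "covered": {"REQ-2"}},  # pay-per-bug draws testers here
]

covered = set().union(*(s["covered"] for s in sessions))
coverage = len(covered & requirements) / len(requirements)
print(f"coverage: {coverage:.0%}; untested: {sorted(requirements - covered)}")
# coverage: 50%; untested: ['REQ-3', 'REQ-4']
```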

Agility: In an agile setup, requirements evolve regularly, documentation is kept to a minimum, and communication between team members is key. Crowdsourcing the testing function inherently means a major change in the way such a development team works. To ensure that testers understand the full scope and impact of the project, documentation will need to be produced in fine detail, with expectations clearly highlighted and transparent; this in turn requires investing resources to keep that documentation up to date. Requirements will have to be frozen and baselined at some point so that the testers can be briefed with reasonable confidence. The testing function will be bolted on at the end of the development lifecycle instead of being a continuous collaboration with programmers. And finally, communications may break down, either because time-zone differences leave no opportunity for direct interaction with customer representatives, or because everything must pass through a middle-management layer where things get delayed or lost in translation.

In other words, not agile.

Cultural differences: In a setup where people of different behaviours and work ethics attempt to work together towards a common goal, there are bound to be some uneven interactions. It may be difficult to understand each other’s languages and thought processes.

Thinking and mindsets vary wildly. At one end of the spectrum is the tester for whom it is perfectly acceptable to challenge a programmer directly or provocatively. Such a tester’s style of communication may be misconstrued as rude or aggressive. At the other end is the tester who may take for granted that they shouldn’t object to anything a programmer says or does out of politeness and respect. Such a tester’s style may be misconstrued as lazy or not proactive.

And what of test data? Where one might find a name like ‘X Æ A-12’ objectionable, another might find it acceptable.

Production environments: When a patch fix for an eCommerce site goes live, company testers put test orders through the live system to ensure that checkouts and payments have not broken. These test orders sit in the live system and need to be refunded, and company testers are usually privy to the login credentials of several payment gateways. It is virtually impossible for crowdsourced testers to take on this responsibility.
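
For comparison, the in-house version of that check might look like the sketch below. The PaymentGateway class and its methods are hypothetical stand-ins for a real gateway client, and the credential handling is deliberately simplified:

```python
import os

class PaymentGateway:
    """Hypothetical stand-in for a real payment gateway client."""
    def __init__(self, api_key):
        self.api_key = api_key
    def place_order(self, amount):
        # A real client would call the gateway's API here.
        return "ORDER-TEST-001"
    def refund(self, order_id):
        # A real client would issue the refund via the API.
        return True

def smoke_test_checkout():
    # Credentials only trusted in-house testers hold, e.g. from a vault.
    gateway = PaymentGateway(api_key=os.environ.get("GATEWAY_KEY", "demo"))
    order_id = gateway.place_order(amount=1.00)   # minimal live test order
    assert order_id, "checkout is broken on live!"
    assert gateway.refund(order_id), "refund failed: follow up manually!"

smoke_test_checkout()
```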

Bigger picture and fit for purpose: Often the real value of testing comes not only from detecting bugs but also from looking at the entire project holistically: taking an overview of the end-to-end business problem and its solution, and bridging the gaps where cracks exist and things have the potential to fall through. With compartmentalised crowdsourced testing, this bigger picture is lost, and with it a large part of testing’s effectiveness.

Senior business users who perform user acceptance testing have deep domain expertise. From knowledge and experience, they know which parts of the system are used to what degree, what to review, and in exactly what way. In crowdsourcing, random testers act as users and may not know how the system is intended to be used.

Conclusion

If your organisation is looking to crowdsource quality testing as an answer to its prayers, it is worthwhile to carry out due diligence and consider some of the advantages and pitfalls of the concept before making any firm decisions. Conversely, if you own or run a crowdsourcing setup, it pays to be prepared with answers to the most commonly asked questions your potential customers are likely to bring up.