The 2002 movie Minority Report is about a police unit called PreCrime, which can predict when people will commit a crime so they can be arrested before it happens. Things go awry when a team member played by Tom Cruise is himself “pre-accused.”
To make predictions, the unit relies on technology and the special abilities of three psychics known as “precogs.” While we don’t have the capabilities of the PreCrime unit today, machine learning is being used to make predictions and influence behavior.
One of the most notorious examples involves Cambridge Analytica, a British company that improperly used Facebook data to influence the 2016 US presidential election. Attempts to predict results and manipulate voters with data are not new, however; they go back to the 1960s. The technology also reaches well beyond politics: from movie suggestions on Netflix to auto-correct to scanning faces instead of boarding passes, we can’t get away from artificial intelligence.
While some of this makes our lives more convenient, it is unnerving to think about all the data that is collected and analyzed about each of us. For example, you can review your Google activity on Google’s My Activity page, and you may notice that ads are tailored to the things you search for and, possibly, what you talk about.
Those of us in the data platform community are in an awkward position. Data science is an exciting and growing field that many of us work in or are training for as our next career move, even though we don’t want to be part of the collected data ourselves. The Data Platform WIT group recently held a Beginning Data Science day, and I have to admit that the topics were compelling.
In the past, data scientists were mostly PhDs with years of training, but now Microsoft and other organizations have “democratized AI.” With drag-and-drop interfaces, it’s easy to set up a machine learning experiment, but will the results be accurate if the user doesn’t understand machine learning? On the other hand, does knowing how to do the same thing in Python make a difference if you don’t understand the data or if you bring your own biases to it? For example, some face recognition systems work far better on white faces than on darker-skinned ones, and hundreds of AI tools built to help diagnose Covid-19 or predict its course proved to be of little or no practical use.
Machine learning is here to stay. As the tools become easier to use, it’s essential to ensure that those using them have the proper training and knowledge.