If you become a data scientist, sooner or later you are going to find yourself facing questions of ethics and morality surrounding the work you do with massive datasets. The very feature of broad, deep, insightful data analysis that makes it so powerful as a predictive tool also turns it into a threat against individual privacy. For businesses, governments, and other task-oriented organizations, deep data analysis on enormous volumes of information provides a window into the lives of the people who generate that information. But it’s a window those folks may not want opened.
This can be dangerous for corporations, as well. A 2013 survey found that nearly a third of consumers had stopped doing business with a particular company over perceived violations of data privacy. That’s the exact opposite of the outcome businesses are hoping for when they use big data, and it’s a line data scientists have to learn to tread carefully.
The Benefits: Big Data Brings Insight and Clarity Into Mysterious Trends
Many of the benefits of big data are economic in nature, but that doesn’t automatically introduce a moral stain to the field: economics is the science of the production, distribution, and consumption of goods and services. It’s a look at the forces in modern society covering the exchange and distribution of wealth. By understanding those forces better, the systems that govern those exchanges can be run more efficiently and humanely than simply tinkering with them by trial and error.
Even when a commercial profit motive exists for big data analysis, the benefits may not rest solely with the corporation.
Knowing People Better Allows Companies to Better Serve Them
Privacy advocates are quick to cast aspersions on the efforts big business is putting forward to collect information and use it to profile individuals, but there’s no profit in upsetting customers. Many people do not recognize the extent to which data about them is collected and used to profile them and predict their wants and needs. And not everybody is going to feel better about it when they’re told it’s all being done in an effort to fulfill those wants and needs.
Google and Facebook go to great lengths to build profiles of individual users in order to show them advertisements that will appeal to them. The enhanced personalization is largely about making sure those advertisements are ones users are actually interested in seeing, however. Some would argue this really is a win/win/win for advertisers, media companies, and users. Users with wants and needs are having those desires addressed more directly than they would otherwise, all because of big data.
Similarly, Google alters search results based on what it knows about you, not to manipulate how you think, but to deliver results more in line with what you are likely to be looking for. If it seems creepy for the search giant to know that you are in New York and give you nearby results when you search for “chinese restaurants,” it’s also more useful than seeing a list of eateries in Beijing.
Other companies use big data to streamline their operations, making life easier for customers. Amazon, famously, used big data analysis to find that sometimes using oversized boxes for small orders could improve the shipping speed and efficiency, something that would’ve gone completely overlooked if not for a deep dive into relevant data.
Big Data Roots Out Social and Health Problems That Otherwise Go Unexplained
Analysis of big data is also key to unearthing and correcting social and medical ills that may otherwise be under-appreciated or undiagnosable.
Much of the recent push to correct injustice in the criminal justice system in the United States, for example, has been fueled by analysis of statistical data on arrest rates, traffic stops, convictions, and parole outcomes that had otherwise stayed under the radar for most Americans. Non-profits focusing on this information have been able to draw attention to deep-seated racial injustices that had long gone unexamined.
In healthcare, analytics focusing on large sets of medical data have identified markers in vital signs that point to possible infections in premature infants up to 24 hours before more conventional diagnostics would alert doctors to the danger. Similar explorations into trends and patterns, possible only with advanced analytics, could save tens of thousands of lives in the coming years.
And in public safety, governments have begun to use big data to predict everything from earthquake aftershocks to the probability of crime in a given location, allowing them to deploy resources to head off danger before it hits.
The Concerns: Big Data Can Lead to Big Brother Controls Over Individuals
There is no question that these benefits come with some drawbacks, however. Information can be used for good or ill. Nowhere has that become more evident than in the United States in 2018 as evidence has slowly emerged to indicate that big data may have played a role in attempts at illicit manipulation of the 2016 presidential election.
Knowing People Better Allows Companies to Better Serve Them… as Dinner
Although Facebook and Google use the information they gather to provide real benefits to their users, they are also happy to sell that information off to other businesses that may offer no advantages to those customers at all. Stuffed mailboxes and bouts of spam email may be the only result of those transactions.
Perhaps even worse, other companies have outright lied to customers in order to collect data that should never have been shared in the first place. Urban Outfitters was forced to settle a class action lawsuit over the collection of customer ZIP codes in 2015; it told customers they had to give their ZIP codes with credit card transactions as a part of the payment process, when in fact the company used the information solely to mine addresses for advertising purposes.
Big Data Can Provide One-Stop Shopping For Thieves
Big data breaches are happening so frequently that they are almost impossible to catalog now. After a record breach at the Equifax credit reporting agency in 2017 exposed the Social Security numbers and other personal banking and credit information of almost half of all Americans, many security professionals simply shrugged: most of the data had already been exposed, piecemeal, in thousands of earlier attacks.
When firms fail to secure data and expose consumers to crime, then their own intentions for using that data become irrelevant.
It’s not just bad guys who can end up with damaging information from big data, either; it can also reach the wrong people inadvertently. The case where Target tipped a father to his teen daughter’s pregnancy before she had told him about it has entered big data folklore, but it’s a good example of how predictive marketing can reveal details about individuals to people they would rather keep them from.
Data scientists have to consider not only their own potential use of large data sets, but also the possible outcomes of having those data sets compromised. Although privacy advocates make the case that government protections of private data are weak, courts have consistently weighed in with heavy fines and awards in lawsuits after personal information has been exposed. Target paid almost $20 million to settle charges over a 2013 data breach, and health insurer Anthem reached a record $115 million agreement in 2017 to settle a class action lawsuit over the loss of personal records from 79 million customers.
Big data isn’t going anywhere, leaving society to do some soul-searching and analysis of its own about the costs and benefits of these collection and analytical methods.