Preparing for a Data Science Career in Healthcare with a Master’s Degree


Healthcare has the same technological drivers that other industries do when it comes to data management:

  • New sensor technology has dramatically increased the frequency and reliability of data being generated by individual patients.
  • Governmental regulation is demanding that more data be tracked more carefully for purposes of compliance.
  • Tighter margins are pushing management to look for improved information on the options and consequences of major strategic business decisions.

Apart from that, however, clinicians and researchers alike are beginning to understand the applicability of data mining on vast scales when it comes to diagnosing and treating many medical conditions.

But not all data science revolves around data mining. Some of the more exciting applications of data science to healthcare come from the other end of the scale: precise analytics applied to discrete information about individual patients.

Devising algorithms and techniques that turn this data into actionable information for clinicians increasingly demands a very high level of expertise. That need for proficiency in data analysis is a big reason why many data scientists seeking careers in healthcare are now opting to obtain a master’s degree.

The Evolution of Personalized Healthcare: From Spray and Pray to Custom-Tailored Treatment Plans

With the increase in availability and decrease in cost of wearable health monitors and the emergence of biointegrated sensors, we’ve entered a new era in which doctors are able to conduct precise and continuous physiological monitoring of patients in all phases of their daily lives, not just in the moments they are under observation in a hospital. Once-thorny and nebulous diagnoses become simpler as continuous monitoring reveals intermittent symptoms that could indicate serious conditions.
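To make the idea of catching intermittent symptoms concrete, here is a minimal sketch of how a stream of wearable readings might be screened. The function name `flag_episodes`, the window size, the 30 bpm threshold, and the sample readings are all illustrative assumptions, not clinical guidance or any vendor's method.

```python
from statistics import median

def flag_episodes(samples, window=5, threshold=30):
    """Return indices of readings that deviate from the median of the
    preceding `window` readings by more than `threshold` (in bpm)."""
    flagged = []
    for i in range(window, len(samples)):
        baseline = median(samples[i - window:i])  # rolling baseline
        if abs(samples[i] - baseline) > threshold:
            flagged.append(i)
    return flagged

# Hypothetical heart-rate readings (bpm), one per minute.
readings = [72, 74, 71, 73, 72, 75, 140, 138, 74, 73, 72]
print(flag_episodes(readings))  # prints [6, 7]
```

A median baseline is used here because a short spike barely moves it, so the readings that follow an episode are not flagged by mistake.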

Moreover, the ability to integrate that data with other diverse patient information, such as sleep cycles, sedentary versus active waking time, and even time spent talking to other people, allows for a greater understanding of correlated and causal factors. Each of us has a unique mix of genetics and lifestyle factors that together shape our physiology.

Medical outcomes can depend as much on the patient as on the mechanism of treatment. Certain treatments work well for certain classes of patients, but doctors are largely playing a numbers game when they prescribe drugs or therapies, advocating courses of treatment that have been found to work more often than not for most people.

There is every reason to believe that, with more individual information and a better understanding of how to use it, treatments can be developed and selected to be the most effective for the particular patient being treated, not simply for the statistical masses. The numbers game has a real cost, not just in terms of outcomes but financially: it is estimated that some $600 billion in annual healthcare costs in the US is attributable to treatment variations that fail to improve patient outcomes.

How Data Science is Driving Virtual Surgical Planning and the Development of Disposable Patient-Specific Medical Hardware

While integrating diverse data sources for information about one particular patient is in and of itself a killer app for healthcare data science, the prospects are even brighter when doctors can also incorporate other massive data sets into their diagnostics.

  • Outcomes tracking can be used to find comparative efficacy for different treatment plans
  • Genetics research can be used to suggest lines of investigation for both diagnosis and treatment
  • Epidemiological data tracking disease outbreaks and progression can point doctors toward disease vectors they might not otherwise have immediately considered in their diagnoses
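As a rough illustration of the first item, outcomes tracking for comparative efficacy can start as simply as tallying how often each treatment plan led to improvement. The sketch below assumes a hypothetical list of `(treatment, improved)` pairs; the data is invented, and a real analysis would also need to adjust for patient differences and sample size.

```python
from collections import defaultdict

def success_rates(outcomes):
    """Map each treatment to the fraction of its patients who improved."""
    counts = defaultdict(lambda: [0, 0])  # treatment -> [improved, total]
    for treatment, improved in outcomes:
        counts[treatment][1] += 1
        if improved:
            counts[treatment][0] += 1
    return {t: good / total for t, (good, total) in counts.items()}

# Hypothetical outcome records: (treatment plan, did the patient improve?)
outcomes = [
    ("plan_a", True), ("plan_a", True), ("plan_a", False), ("plan_a", True),
    ("plan_b", False), ("plan_b", True), ("plan_b", False), ("plan_b", False),
]
print(success_rates(outcomes))  # prints {'plan_a': 0.75, 'plan_b': 0.25}
```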

The advantages of personalization don’t end there. A new sub-discipline called Virtual Surgical Planning has started to emerge at the confluence of advanced medical imaging, surgical event simulation, and 3-D printing. Starting from in-depth medical imaging data for an individual patient in need of surgery, accurate 3-D models of the surgical site are developed. Surgeons are able to visualize and practice their procedure repeatedly and accurately with the model. Equally important, patient-specific, disposable instruments and hardware can be produced for use in the procedure.

Such techniques reduce risk in advanced surgical procedures and let surgeons customize their tools and approach to account for anatomical differences between patients.

Resolving the Conflict Between Privacy Concerns and Medical Innovation

With all this highly personalized data come additional management challenges. In particular, both law and ethics protect a patient’s right to privacy in their medical information. Securing such information while still using it effectively is a thorny problem that will claim a lot of attention from data scientists in the healthcare world.

HIPAA – the Health Insurance Portability and Accountability Act of 1996 – forced a sea change in the way that medical providers stored and shared patient data. A touchstone for patient privacy advocates, HIPAA was also designed to encourage the establishment and use of electronic health records (EHR).

These two goals, protecting privacy and promoting electronic records, pull against each other, a tension that only data scientists are likely to resolve. Many of the customary tools and techniques of data analysis are constrained by HIPAA requirements. Cloud data storage, a tool used without a second thought by most data scientists, may require special security for HIPAA compliance. Records themselves may have to be scrubbed of certain information – anathema to data analysts. A lot of intellectual horsepower is being thrown at the problem of how best to use all the new information being generated by EHR without incurring the wrath of the Department of Health and Human Services, privacy advocates and the public.
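To give a sense of what scrubbing a record can involve, here is a minimal sketch that drops direct identifiers and replaces the patient ID with a salted hash so records can still be linked. The field names, the `DIRECT_IDENTIFIERS` set, and the `scrub_record` function are hypothetical, and this is not by itself a complete HIPAA de-identification procedure.

```python
import hashlib

# Hypothetical set of direct identifiers; not the full list HIPAA covers.
DIRECT_IDENTIFIERS = {"name", "ssn", "phone", "email", "address"}

def scrub_record(record, salt):
    """Drop direct identifiers and replace patient_id with a salted hash."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    if "patient_id" in cleaned:
        raw = (salt + str(cleaned["patient_id"])).encode("utf-8")
        # Truncated SHA-256 digest serves as a stable pseudonym.
        cleaned["patient_id"] = hashlib.sha256(raw).hexdigest()[:16]
    return cleaned

# Hypothetical patient record.
record = {
    "patient_id": 1042,
    "name": "Jane Doe",
    "phone": "555-0100",
    "heart_rate": 88,
    "diagnosis_code": "I48.0",
}
scrubbed = scrub_record(record, salt="example-salt")
print(sorted(scrubbed))  # prints ['diagnosis_code', 'heart_rate', 'patient_id']
```

Hashing with a secret salt keeps the same patient linkable across records without exposing the original ID, which is the trade-off analysts usually want: less identifying detail, but the clinical fields intact.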

And their wrath has real teeth. Even violators who did not know, and indeed could not have known, that they were in violation have been slapped with fines of $100 per violation, with precedents in place for $1.5 million in total fines. Compliance requirements are breeding an entirely new career path in healthcare data science, simply to analyze and guard against HIPAA violations.

Data Science in Healthcare Could be Worth $100 Billion … Mostly in Savings

Even with the limitations imposed by HIPAA, however, the healthcare industry is poised for a revolution on the back of advances in data science. A 2011 study conducted by McKinsey Global Institute, the economic and business research arm of consulting firm McKinsey & Company, suggests that the U.S. healthcare system could drive more than $100 billion in new value each year with the creative and effective use of big data. The bulk of that value would come from reductions in healthcare expenditures, welcome news to a public overwhelmed with rapidly rising healthcare costs.

There are many legitimate use cases for patient data even within the constraints of privacy regimes like HIPAA. Big Data excels at working from large sets of data composed of many individual pieces of information that may not even be personally sensitive. Google’s Flu Trends website was a case in point. Aggregating anonymous search data from wide swaths of users, data scientists at the company were able to track influenza outbreaks with considerable accuracy merely from specific search terms often used by the infected.

Similar studies, and results, are no doubt waiting right around the corner for the next bright graduate of a data science master’s program to uncover.
