Smart Surgery

How Big Data Mining in the Hospital Actually Works

05 Oct 2018

It’s 2018, and the Data Revolution is in full swing.

With more ways to collect information about ourselves and the world around us - in combination with advanced analytics, AI, and machine learning - data mining applications have the potential to greatly alter our lives and our measurement of life’s most vital elements.

Among industries, healthcare demonstrates the highest potential for data applications, owing to the vast quantities of untapped patient data resources. From predictive and diagnostic tools to patient-specific medication and treatment, data mining and advanced analytics are already reshaping the kind of care patients receive.

While these technologies constitute the forefront of medicine, that is not to say they are without their own set of challenges. Collecting information stored in disparate hospital systems poses significant logistical and technical challenges. Maintaining patient privacy and protecting hospital systems are also paramount. Beyond that, managing and securely storing such vast volumes of data is difficult and costly for providers.

Navigating New Terrain: Data mining. Big data. Advanced analytics.

A host of buzzwords are floating around the healthcare data space, but you don't need a degree in data science to get the picture.

In the hospital, collected patient information (in multiple formats) is usually stored across a few different platforms: EHRs, PACS, HIS, etc.

Data mining is essentially a two-part process:

1) Collecting, cleaning, and organizing the disparate data sources

2) Finding patterns within these data sets.
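To make the two steps concrete, here is a minimal sketch in Python. It is not how any particular hospital system works: the records, field names, and the "readmission rate per diagnosis" pattern are all made up for illustration, and a real pipeline would involve far messier data and far stricter privacy controls.

```python
from collections import Counter

# Hypothetical extracts from two hospital systems.
# All field names and values are invented for this example.
ehr_rows = [
    {"patient_id": "P001", "diagnosis": " Diabetes "},
    {"patient_id": "P002", "diagnosis": "hypertension"},
    {"patient_id": "P003", "diagnosis": "DIABETES"},
    {"patient_id": "P004", "diagnosis": None},  # incomplete record
]
his_rows = [
    {"patient_id": "P001", "readmitted": "yes"},
    {"patient_id": "P002", "readmitted": "no"},
    {"patient_id": "P003", "readmitted": "yes"},
    {"patient_id": "P004", "readmitted": "no"},
]


def collect_and_clean(ehr, his):
    """Step 1: join the two sources on patient_id and normalize values."""
    readmitted = {row["patient_id"]: row["readmitted"] == "yes" for row in his}
    cleaned = []
    for row in ehr:
        diagnosis = row["diagnosis"]
        if not diagnosis or row["patient_id"] not in readmitted:
            continue  # drop incomplete or unmatched records
        cleaned.append({
            "patient_id": row["patient_id"],
            "diagnosis": diagnosis.strip().lower(),
            "readmitted": readmitted[row["patient_id"]],
        })
    return cleaned


def readmission_rate_by_diagnosis(records):
    """Step 2: look for a simple pattern, the readmission rate per diagnosis."""
    totals = Counter()
    readmits = Counter()
    for rec in records:
        totals[rec["diagnosis"]] += 1
        if rec["readmitted"]:
            readmits[rec["diagnosis"]] += 1
    return {dx: readmits[dx] / totals[dx] for dx in totals}


records = collect_and_clean(ehr_rows, his_rows)
rates = readmission_rate_by_diagnosis(records)
print(rates)
```

Even in this toy version, most of the work happens in step 1: the differently formatted "Diabetes" entries only line up once they have been cleaned, and the incomplete record has to be dropped before any pattern can be trusted.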

Defining the difference between data mining and ‘big data’ is murkier, but it essentially boils down to scale. Big data brings together the various patterns revealed by data mining and compares them in order to create new, somewhat broader insights.

Leveraging big data, hospitals learn more about risk: operational risk, financial risk, and risk to patient safety and outcomes. Machine learning and AI are the next step in this progression: the more information data technologists collect and aggregate, the better able they are to make predictions (e.g., risk calculation) and inform treatment pathways (e.g., automated decision support).

This is just a snapshot of the future of healthcare.

Roadblocks Ahead

While big data is already changing healthcare, the usage, security, and anonymization of patient records remain a big concern. HIPAA in the US and the GDPR in the EU both set out stringent rules to ensure patient data privacy. Ensuring compliance and building more secure systems are both vital as big data usage continues to grow: in the last 3 years alone there have been 955 HIPAA data breaches. GDPR places even heavier restrictions on how patient data must be stored.

The optimal balance between the “right to innovate” and the right to privacy remains hotly contested and will continue to present a hurdle for data technologists.


Data usage is already transforming healthcare. While challenges on how best to address privacy concerns remain at the forefront of the conversation, AI and machine learning are expected to make a huge impact on healthcare tech, finance, quality, and ultimately, outcomes. Preventative and predictive care is the next phase, and big data is how we’re getting there.



Written by Sean Witry

Sean Witry works as Community Manager and Content Marketer at caresyntax. With a background in international affairs and a focus in public health, he's passionate about exploring technology's role in increasing access to quality care and improving patient outcomes. You can reach him on LinkedIn or via email at
