AI, Big Data and future healthcare

July 1, 2025

By Mayo Clinic Press Editors

Can artificial intelligence, or AI, save lives? Not directly, but recent innovations in AI have improved the data that medical professionals use to save lives. The following excerpt from Transform, by Mayo Clinic experts Paul Cerrato, M.A., and John D. Halamka, M.D., M.S., offers a real-life example of AI in action. The authors use facts and real examples to explain how experts are using AI to transform healthcare into a more productive and inclusive service. Learn more about digital innovation and the future of healthcare in Transform.

Peter Maercklein, a retired financial executive in his early seventies, discovered he had atrial fibrillation (AFib), irregular fluttering heartbeats that can lead to blood clots, with the help of an artificial intelligence (AI)-powered algorithm. Like many people with AFib, Peter didn’t feel symptoms. He and his wife were enjoying retirement in southeastern Minnesota, traveling often in their RV and spending much of the summer up north in Grand Marais. In 2020, Mayo Clinic coordinators in Rochester asked him to participate in a research study. Now complete and published, that study evaluated AI-guided screening that uses electrocardiograms (ECGs) taken during normal heart rhythm to identify previously unrecognized AFib. Peter’s AI-ECG showed he had an 81.49% probability of experiencing AFib, so he was outfitted with a Holter monitor to record his heart rhythm over time. Within a few days, the monitor captured an episode of AFib while Peter was walking on a treadmill at home. He saw his care team and had further testing to confirm the diagnosis, and then went on a blood thinner medication to reduce his risk of having a stroke. Later, Peter had a pacemaker implanted to control his heart rhythm.

Dr. Paul Friedman, one of the cardiologists involved in the study evaluating the benefits of using AI to detect AFib, points out that cardiovascular disease is the number 1 killer in the United States and worldwide, and that many cases of AFib are preventable. What if we could detect problems earlier and head off disease instead of waiting for a life-altering event like a heart attack, cardiac arrest, or stroke? That’s exactly what’s happening at Mayo Clinic and other leading medical centers around the world. Mayo Clinic uses AI to detect and predict heart disease using ECG readings, which are common, relatively inexpensive, and can also be obtained from wearable technology such as a smartwatch. AI can discern electrocardiographic changes that the human eye can’t, picking up electrical signatures of heart disease even before symptoms appear.

Mayo Clinic created an AI-ECG dashboard viewable in the electronic health record (EHR) that shows a patient’s probability of certain heart conditions such as AFib, low ejection fraction (weak heart pump), hypertrophic cardiomyopathy, or HCM (a thickening of the wall of the left ventricle, the lower main pumping chamber of the heart), aortic stenosis (a narrowing of the aortic valve, often caused by calcium buildup), and amyloidosis, which involves a buildup of misfolded proteins. The dashboard provides multiple points for diagnosis, prognosis, and clinical care orchestration. Clinicians can compare AI evaluations of a patient’s ECGs over time and quickly get a wide-lens view of the person’s heart and circulatory system health. Patients can even send Apple Watch ECGs directly to their EHRs via a secure Mayo Clinic app, adding another checkpoint for monitoring changes. This information adds to a clinician’s knowledge and can be used to flag patients who need further testing and, potentially, therapy.

The advantage of AI is more far-reaching than many people realize. It is making innovative medical care more democratic. Across the United States and globally, not everyone has access to a large medical center with specialized diagnostics. The symptoms of some heart diseases are common to other conditions, so how do we more quickly and easily identify patients who need care? AI-ECG algorithms offer a relatively inexpensive way to spot disease and profile individuals who are at increased risk for heart disease. These investigational tools are being used as a guide at Mayo Clinic, and they’re being reviewed by regulators for broader commercial use. Once approved, we anticipate that these algorithms will be widely available and adopted globally to improve diagnosis and patient health.

While the value of AI-fueled algorithms is obvious, what often goes unnoticed is the collection of patient information—a massive dataset—that enables researchers to develop these digital tools in the first place. Without this data bank, it would be nearly impossible to conduct the clinical trials used to create the algorithms.

The power of large numbers

The data network that Mayo Clinic uses contains tens of millions of electronic patient records that can be tapped to gain insights into what causes specific diseases and how best to treat them. This is all part of the Big Data movement that has gained momentum in medicine in recent years. The value of such large numbers is amply illustrated by investigations into possible harms caused by certain prescription drugs. Typically, such medications are tested among a few thousand subjects in clinical trials. Unfortunately, patient populations of this size are often not large enough to detect relatively uncommon adverse effects.
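
A short calculation shows why a few thousand trial subjects can miss an uncommon adverse effect. This is a minimal sketch, not a method from the excerpt: it assumes every patient independently faces the same risk, the 1-in-10,000 event rate is an illustrative assumption, and `prob_at_least_one` is a hypothetical helper name.

```python
def prob_at_least_one(event_rate: float, n_patients: int) -> float:
    """Probability of observing at least one event among n_patients,
    assuming independent patients with an identical per-patient risk."""
    return 1.0 - (1.0 - event_rate) ** n_patients


# Illustrative assumption: an adverse effect that strikes 1 in 10,000 patients.
RATE = 1 / 10_000

# Compare a typical trial size with a records database of 1.4 million patients.
for n in (3_000, 1_400_000):
    p = prob_at_least_one(RATE, n)
    print(f"{n:>9,} patients: P(at least one case) = {p:.3f}")
```

Under these assumptions, a 3,000-person trial has only about a one-in-four chance of seeing even a single case, and one case is far too few to establish elevated risk. With more than a million records, many cases are expected, which is what makes a comparison between exposed and unexposed patients possible.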

A good example of the power of large numbers is a study conducted by David J. Graham, MD, MPH, the U.S. Food and Drug Administration (FDA) Associate Director for Science, Office of Drug Safety, and his associates. They analyzed records from approximately 1.4 million patients who were members of Kaiser Permanente in California. The aim of the research was to determine if rofecoxib (Vioxx), a cyclooxygenase 2 (COX-2) selective nonsteroidal anti-inflammatory drug, increased the risk of acute myocardial infarction (a type of heart attack) and sudden cardiac death. Graham and associates were able to review the equivalent of 2,302,029 person-years of follow-up. In this population, they detected 8,142 cases of serious coronary heart disease (CHD), 2,210 of which were fatal. The odds of developing CHD were 59% higher among patients taking any dose of the medication than among the controls. In patients who took 25 mg/day or less, the odds of developing CHD were 47% higher. Finally, among patients who took high doses, namely more than 25 mg daily, the odds of heart disease were 258% higher.
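
Percentages like these are another way of writing odds ratios, and the conversion is simple arithmetic. The sketch below derives the ratios purely from the percentages quoted above rather than from the study itself; both function names are hypothetical.

```python
def odds_ratio_from_percent_increase(percent: float) -> float:
    """'X% higher odds' means the odds are multiplied by 1 + X/100."""
    return 1.0 + percent / 100.0


def percent_increase_from_odds_ratio(odds_ratio: float) -> float:
    """Inverse conversion: an odds ratio of 1.0 means no difference."""
    return (odds_ratio - 1.0) * 100.0


# Percent increases as quoted in the paragraph above.
reported = [
    ("any dose", 59),
    ("25 mg/day or less", 47),
    ("more than 25 mg/day", 258),
]

for label, pct in reported:
    ratio = odds_ratio_from_percent_increase(pct)
    print(f"{label}: {pct}% higher odds -> odds ratio {ratio:.2f}")
```

Read this way, "258% higher" does not mean 2.58 times the odds. It means the odds were multiplied by 3.58 relative to the comparison group.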

Several smaller, earlier studies published before the FDA data were available suggested an association between Vioxx and heart disease, but those findings had several shortcomings. The data from Graham and associates, which were presented at a conference before being published in The Lancet, made headlines and embroiled the researchers in a confrontation with FDA officials, who initially didn’t want the results made public. Vioxx was withdrawn from the market by Merck on September 30, 2004. But estimates indicate that, during the time the drug was on the market, more than 100 million prescriptions were written and between 88,000 and 140,000 excess cases of serious CHD may have resulted from the public’s exposure to Vioxx. Its removal was clearly a testament to the value of Big Data. The Graham study is only one of many that illustrate the benefits of using massive datasets to glean meaningful insights about healthcare.

The big data approach to healthcare

Large data networks have captured the attention of healthcare executives and clinicians. They’re part of the Big Data movement and a related specialty, data analytics. Big Data is usually distinguished from “small data” by its volume, velocity, and variety. The so-called three Vs reflect the fact that the amount of data available for analysis is huge compared with the quantity of data that has traditionally been obtained from clinical trials and population studies. The databases currently being examined may hold petabytes of data (a petabyte is 1,024 terabytes, and a terabyte is 1,024 gigabytes) or even exabytes (an exabyte is 1,024 petabytes). They may comprise billions of records drawn from sources such as EHR data, social media, claims data, and much more. The databases typically consist of structured data, such as patients’ International Classification of Diseases (ICD) codes, and unstructured data, such as narratives describing patients’ signs and symptoms. The speed, or velocity, with which these data accumulate and can be moved from place to place also distinguishes Big Data from more traditional sources of patient information, as does the variety of data types, which can include input from remote sensors, text data on hard drives and smartphones, and imaging data from videos, photographs, and X-rays, among others.

All the data can be saved in different types of “storage bins,” including relational databases and data warehouses, and analyzed by linking the bins through a process called distributed computing, using a tool such as Hadoop (pronounced huh-DOOP). Data scientists also use terms such as “semantics,” “syntax,” and “ontology,” all of which have special meanings in the context of medical informatics.

In a simple computer file folder or directory, you might store information in individual files as Microsoft Office text documents or PDFs, and images as TIFF or JPEG files. But if you tried to identify relationships, correlations, or patterns between these diverse collections of data, it would be difficult. Relational databases help solve this problem, making analysis easier by creating a schema, which is the structural representation of data in a database. The data are compartmentalized in various tables, fields, rows, and columns, and the database creates relationships among the numerous data points. The database then can be queried to pull out links between elements such as columns and rows. One table may list all of a patient’s demographic details, including age, address, gender, and race, and a second may list their family medical history. By querying the database, you might be able to determine, for example, which female patients have a history of a specific disease because linkages between the demographics and family history tables are built into the tool.
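
The two-table example above can be sketched with Python's built-in `sqlite3` module. This is a toy illustration under stated assumptions: the table names, column names, and three sample patients are invented for the example and do not describe any real system.

```python
import sqlite3

# In-memory database with two hypothetical tables linked by patient_id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE demographics (
    patient_id INTEGER PRIMARY KEY,
    age        INTEGER,
    gender     TEXT
);
CREATE TABLE family_history (
    patient_id INTEGER REFERENCES demographics(patient_id),
    condition  TEXT
);
INSERT INTO demographics VALUES (1, 72, 'F'), (2, 65, 'M'), (3, 58, 'F');
INSERT INTO family_history VALUES
    (1, 'heart disease'), (2, 'heart disease'), (3, 'diabetes');
""")


def female_patients_with_family_history(conn, condition):
    """Join the two tables and return IDs of female patients whose
    family history lists the given condition."""
    rows = conn.execute(
        """
        SELECT d.patient_id
        FROM demographics AS d
        JOIN family_history AS f ON f.patient_id = d.patient_id
        WHERE d.gender = 'F' AND f.condition = ?
        ORDER BY d.patient_id
        """,
        (condition,),
    ).fetchall()
    return [row[0] for row in rows]


print(female_patients_with_family_history(conn, "heart disease"))
```

The shared `patient_id` column is what links the tables. A folder of loose documents has no equivalent key, which is why the same question is hard to answer there.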

Understanding the concept of distributed computing is another piece of the puzzle that can make Big Data less mysterious. In the simplest terms, distributed computing is a way for individual computers to talk to one another and function as one gigantic “brain” despite the fact that the machines may be located all over the world. The internet is a distributed computing network, connected by nodes, routers, and the like. Hadoop is another. It’s currently being used by data analysts to gain insights from amounts of data that are too massive to be stored economically in any one location.
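
Hadoop's best-known programming model, MapReduce, splits a job into a "map" step that each machine runs on its own slice of the data and a "reduce" step that merges the partial results. The sketch below only imitates that pattern on a single machine and does not use Hadoop's API. The shards and the diagnosis codes in them are made-up sample data, and the function names are hypothetical.

```python
from collections import Counter
from functools import reduce

# Hypothetical shards: each list stands in for records held on a separate machine.
NODE_SHARDS = [
    ["I48", "I10", "I48"],
    ["E11", "I48"],
    ["I10", "E11", "E11"],
]


def map_shard(records):
    """Map step: a node tallies the diagnosis codes in its own shard."""
    return Counter(records)


def reduce_counts(left, right):
    """Reduce step: merge two partial tallies into one."""
    return left + right


def count_codes(shards):
    # On a real cluster, each map_shard call would run on a different node.
    partials = [map_shard(shard) for shard in shards]
    return reduce(reduce_counts, partials, Counter())


totals = count_codes(NODE_SHARDS)
print(dict(totals))
```

Only the small partial tallies travel between machines, not the raw records. That is what makes the approach practical when the full dataset is too large to hold in one place.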

An excerpt from Transform by Mayo Clinic experts Paul Cerrato, M.A., and John D. Halamka, M.D., M.S.
