Chapter 7 · Agentic AI · Healthcare · The Future of Medicine

From Paper Charts to AI Agents:
How Epic Is Reinventing American Healthcare

Epic began in a Wisconsin basement in 1979 with one idea: patient information should follow the patient. Forty-five years later, that idea has grown into the infrastructure of American medicine — and is now the launchpad for AI that doesn’t just store information, but acts on it.

Company: Epic · Industry: Healthcare IT · Core concept: Agentic AI & the future of clinical care
Also in this chapter: Lab 7: Build an Agentic AI Workflow in Python →
MIS 432 · AI in Business · Case Study

A forty-five-year arc — from a basement database to the most ambitious agentic AI deployment in American medicine
Level: Upper-division undergraduate
Topics: Agentic AI, ambient AI, EHR history, human-in-the-loop, prompt engineering, AI prototyping, future of healthcare AI
Concepts introduced: 16 key business AI terms

Primary sources: HIMSS 2025 reporting on Epic’s agentic AI strategy (Fierce Healthcare); Microsoft DAX Copilot for Epic documentation; AMA physician burnout surveys 2023–2024; Epic UGM 2025 announcements; peer-reviewed research on EHR adoption and AI clinical outcomes.

In this chapter
1. The founding story: a patient dies, and a database is born
2. Twenty years of paper: what healthcare looked like before the EHR
3. The digital mandate: HITECH, Meaningful Use, and the great digitization
4. The unintended consequence: when digital created new burdens
5. The data foundation: Cosmos and why 280 million records matter
6. The AI era begins: from storage to intelligence
7. Ambient AI: DAX Copilot and the doctor-patient relationship
8. Agentic AI: when AI stops answering and starts acting
9. Epic’s agents: Art, Penny, and the pre-visit assistant
10. The AI Factory at Epic
11. Prompt engineering and AI prototyping in healthcare
12. The human-in-the-loop question
13. Four perspectives: how AI is changing healthcare for everyone
14. Beyond today: what comes next
15. Key vocabulary & discussion questions

1 The Founding Story: A Patient Dies, and a Database Is Born

In the mid-1970s, Gordon Faulkner, a pediatrician in Madison, Wisconsin, learned that one of his young patients had died. The child’s family had moved to Milwaukee, 75 miles away. When the child fell ill at a new hospital, the physicians there had no access to her medical history. They did not know her conditions. They did not know how to treat her. Gordon believed, quietly and with certainty, that if those doctors had had her records, she would have survived.

His wife, Judy Faulkner, was a graduate student in computer science at the University of Wisconsin. She had taken one of the first courses ever offered on computers in medicine, under a visionary professor named Warner Slack. The morning after hearing about the child, she went back to work with a new sense of urgency. She had already been building a database for tracking patient information over time. It was no longer just a research project. It was a mission.

The founding idea — 1979
“Patient information should follow the patient — not stay locked in a filing cabinet at the last doctor’s office they visited.”

In 1979, working on a single refrigerator-sized minicomputer in the basement of a house at 2020 University Avenue in Madison, Judy Faulkner co-founded a company called Human Services Computing with $70,000 in startup capital. The company would later be renamed Epic Systems. She has run it ever since, now in her eighties, overseeing a company of 14,000 employees on a 1,670-acre campus outside Madison.

Epic has never gone public. It has never taken outside investment after those early days. It has never made a significant acquisition. Faulkner runs it according to principles she calls “commandments” painted on walls throughout the campus. The first three: do not go public, do not acquire or be acquired, and software must work. This independence has allowed Epic to make long-term decisions that publicly traded companies cannot — including investing decades in data infrastructure that now makes its entire AI strategy possible.

2 Twenty Years of Paper: What Healthcare Looked Like Before the EHR

To understand Epic’s story, you have to understand what healthcare looked like before it. For most of the twentieth century, and well into the twenty-first, the medical record was a physical object: a manila folder stuffed with handwritten notes, carbon-copy prescription pads, typed discharge summaries, and stapled lab printouts. It lived in a filing cabinet at a single hospital or clinic. When a patient went to a different facility, their records did not go with them.

The practical consequences were severe. Physicians made decisions without a patient’s full history. Medications were prescribed without knowing what a patient was already taking elsewhere. Allergies discovered at one hospital were unknown at another. Redundant tests were ordered because no one knew the same tests had already been run. Specialists wrote letters to referring physicians that arrived weeks after the visit. Emergency physicians made life-or-death decisions with incomplete information as a matter of routine.

What paper looked like in practice
A typical patient encounter in 1995

A 67-year-old woman arrives at an emergency department in severe abdominal pain. She takes medications for heart disease but cannot remember their names. Her primary care doctor is in a different health system. His office won’t open for three hours. The hospital has no record of her previous visits because she was last hospitalized at a different facility five years ago.

The ER physician orders a full medication reconciliation from scratch. Blood is drawn for labs that were already done at her cardiologist’s office two weeks ago. She waits. The physician makes decisions based on what the patient can remember, supplemented by educated guesses. This is not a worst-case scenario. This is Tuesday.

Epic’s earliest products addressed this problem on a small scale through the 1980s and 1990s. But adoption was slow and expensive. Most hospitals remained on paper, or on fragmented electronic systems that could not communicate with each other. A patient’s information was still trapped — just now in a digital silo instead of a filing cabinet.

3 The Digital Mandate: HITECH, Meaningful Use, and the Great Digitization (2009–2015)

The moment that changed everything was not a technology breakthrough. It was a law. In February 2009, President Obama signed the American Recovery and Reinvestment Act. Buried inside it was the Health Information Technology for Economic and Clinical Health Act (HITECH).

HITECH created a simple but powerful incentive structure: hospitals and physicians who adopted certified electronic health records and demonstrated “meaningful use” of them would receive substantial Medicare and Medicaid payments. Those who had not adopted EHRs by 2015 would face financial penalties. The federal government committed $27 billion to the program. The message to healthcare was unmistakable: digitize now, or lose money.

Key concept
Meaningful Use
Meaningful Use was the standard hospitals and physicians had to meet to qualify for HITECH incentive payments. It was not enough to buy an EHR — providers had to demonstrate actual clinical use: electronic prescribing, health information exchange with other providers, patient engagement through online portals, and tracking of clinical quality measures. Meaningful Use created a minimum floor for what EHRs had to do, driving standardization across an industry that had previously been a patchwork of incompatible systems. The program has since been renamed Promoting Interoperability, but its framework still governs EHR requirements today.

The results were dramatic. In 2008, fewer than 10% of hospitals had adopted basic EHR systems. By 2015, more than 80% had. Annual EHR adoption rates among eligible hospitals jumped from 3.2% per year before HITECH to 14.2% per year after. The U.S. healthcare system digitized faster, in a shorter time, than almost any comparable transformation in any other sector of the American economy.

Epic was the primary beneficiary. Its comprehensive integrated platform — covering clinical documentation, billing, lab ordering, pharmacy, and the patient portal (MyChart) — was exactly what hospital systems needed to meet Meaningful Use requirements quickly. Between 2009 and 2015, Epic went from serving hundreds of hospitals to thousands. Implementation wait lists stretched years into the future. The Verona campus expanded rapidly.

<10% · U.S. hospital EHR adoption in 2008
>80% · U.S. hospital EHR adoption by 2015
$27B · Federal HITECH investment
38% · Epic’s current U.S. hospital market share

4 The Unintended Consequence: When Digital Created New Burdens

The digitization of American healthcare solved the access problem. Patient information could now follow the patient — at least within systems using the same software. But it created a problem no one fully anticipated: documentation burden.

Paper records were incomplete and fragmented — but they were fast to write. A physician could scrawl a few lines of notes, sign the chart, and move on. Electronic health records were comprehensive and structured, but they required orders of magnitude more data entry. Every medication had to be selected from a dropdown. Every diagnosis had to be coded against a classification system. Every clinical decision had to be documented in a structured template. Every insurance interaction generated forms. The EHR, designed to make information more accessible, also made creating that information far more labor-intensive.

The numbers behind the crisis
By 2023, the American Medical Association found that 48% of physicians reported feeling burned out, with EHR documentation consistently identified as the leading cause. Physicians reported spending an average of nine minutes in the EHR for every 15 minutes spent with a patient. More than 20% reported spending over eight hours per week on the EHR outside of normal working hours — a phenomenon the medical community calls “pajama time.” Emergency department nurses spent approximately 27% of their shifts on EHR-related tasks — nearly the same proportion as direct patient care. The technology designed to improve healthcare was, in measurable and documented ways, degrading the experience of practicing it.

This created a paradox at the center of Epic’s dominance. Epic’s software was simultaneously the essential infrastructure of modern healthcare and a primary driver of clinician burnout. The company built on the mission of serving patients had built a system that was, in many cases, pulling clinicians away from patients. The instrument of care had become a barrier to caring.

👨‍⚕️   Physician perspective
“I went into medicine to take care of patients. By 2018, I was spending more time staring at a computer screen than looking at the person in front of me. I knew my patient’s name, but my eyes were on the dropdown menu for their medication. I had become a data entry specialist with a medical degree.” — Composite of reported experiences from AMA physician surveys, 2022–2024
👩‍⚕️   Nurse perspective
“We were hired to care for patients, but half my shift is clicking through screens. Medication administration requires multiple clicks to document each dose. Vital signs have to be entered in several places. I love my patients — but the EHR makes me feel like the patient is the computer, not the person in the bed.” — Composite of reported nursing experiences

This is the context into which Epic began deploying AI. The problem was not primarily a data problem. It was a human problem: how do you give clinicians back the time and attention that digitization had, unintentionally, taken from them?

5 The Data Foundation: Cosmos and Why 280 Million Records Matter

Before getting to the AI, we need to understand what made it possible: the data. What Epic built over forty-five years — before anyone was thinking about large language models or agentic systems — was the largest longitudinal clinical dataset ever assembled.

Cosmos is Epic’s de-identified clinical data platform. As health systems join Cosmos, their anonymized patient data flows into a shared research database. Today Cosmos contains records for more than 280 million patients, with over 16 billion clinical data points: diagnoses, medications, lab results, vital signs, imaging findings, clinical notes, procedures, and outcomes — accumulated over time, across multiple care settings, in standardized and structured format.

Key concept
Longitudinal clinical data
Most datasets capture a snapshot — what is true at a single moment. Longitudinal data captures a trajectory: what happened to this patient over months or years, across multiple encounters and care settings. This is what makes Cosmos uniquely powerful for AI training. A model trained on Cosmos can learn not just what a patient looks like at a single visit, but how patients with specific characteristics tend to progress over time — who improves, who deteriorates, what interventions make a difference, and what patterns predict complications before they appear. No dataset of clinical snapshots can teach a model the same thing, because medicine plays out over time.
Why a tech company cannot replicate this
Google Health, Microsoft, Amazon — all have tried to build clinical AI platforms. None has access to a dataset approaching Cosmos in scale or quality. The reason is structural: to build this data, you needed to be inside thousands of health systems for decades, earning their trust, managing their data under HIPAA, and operating within legal and contractual frameworks that take generations to establish. Epic has been doing this since 1979. The AI advantage is not in the algorithm. It is in the data that no check can purchase. A startup trying to replicate Cosmos’s longitudinal depth today would need not just money — it would need decades.
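The difference between snapshot and longitudinal data can be made concrete with a small sketch. This is a toy example with invented field names and values, not Cosmos's actual schema: the same patient, seen once, looks borderline; seen as a trajectory, she is clearly deteriorating.

```python
from datetime import date

# Hypothetical, simplified longitudinal view of one de-identified patient:
# multiple encounters over time, not a single snapshot. Field names and
# values are illustrative only.
encounters = [
    {"date": date(2023, 1, 10), "creatinine": 0.9},
    {"date": date(2023, 8, 2),  "creatinine": 1.1},
    {"date": date(2024, 3, 15), "creatinine": 1.4},
]

def creatinine_trend(encounters):
    """Total change in creatinine from first to last encounter."""
    ordered = sorted(encounters, key=lambda e: e["date"])
    return round(ordered[-1]["creatinine"] - ordered[0]["creatinine"], 2)

# The latest snapshot (1.4) alone looks borderline; the trajectory
# (+0.5 over 14 months) is the signal worth flagging.
print(creatinine_trend(encounters))  # 0.5
```

A model trained only on the final row sees one number; a model trained on the sequence can learn what a rising trend tends to predict.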

One concrete example: Epic’s Sepsis Prediction Model monitors every inpatient at participating hospitals every 15 minutes, recalculating each patient’s sepsis risk score using dozens of clinical variables updated in real time. One validation study found that after the model was implemented, sepsis-related mortality at one hospital system declined by 44%; a separate study of an AI sepsis surveillance tool found a 17% reduction in mortality. These are not incremental improvements. They are lives. And they are made possible by the data foundation that was built over decades, before anyone was thinking about AI.
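The mechanics of periodic threshold-based risk alerting can be sketched in a few lines. Everything here is invented for illustration — the variables, weights, and threshold are toys, and Epic's actual model uses dozens of variables and proprietary logic — but the shape of the loop is the same: score, compare to a threshold, alert a human.

```python
# Toy sketch of threshold-based risk alerting. Weights, baselines, and
# the alert threshold are illustrative assumptions, not Epic's model.
WEIGHTS = {"heart_rate": 0.02, "temp_c": 0.5, "wbc": 0.03, "resp_rate": 0.04}
BASELINES = {"heart_rate": 80, "temp_c": 37.0, "wbc": 8.0, "resp_rate": 16}
ALERT_THRESHOLD = 1.0

def risk_score(vitals):
    """Weighted deviation from baseline across monitored variables."""
    return sum(w * abs(vitals[k] - BASELINES[k]) for k, w in WEIGHTS.items())

def check_patient(vitals):
    score = risk_score(vitals)
    return {"score": round(score, 2), "alert": score >= ALERT_THRESHOLD}

# In a real deployment this runs on a schedule (every 15 minutes in
# Epic's case), catching rising risk between routine nursing assessments.
stable = {"heart_rate": 82, "temp_c": 37.1, "wbc": 8.5, "resp_rate": 16}
septic = {"heart_rate": 118, "temp_c": 38.9, "wbc": 16.0, "resp_rate": 26}
print(check_patient(stable))   # alert: False
print(check_patient(septic))   # alert: True
```

The point of the sketch is the human-facing output: the model never treats anyone; it raises a flag that a nurse or physician must act on.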

6 The AI Era Begins: From Data Storage to Intelligence

Epic’s AI journey did not start with large language models. It began years earlier with predictive models — statistical systems trained to find patterns in clinical data and alert clinicians to risk before it became crisis.

2008–2015 — The digitization era
From paper to pixels
HITECH drives mass EHR adoption. Epic grows to serve thousands of hospital systems. Patient data begins accumulating at scale for the first time. Cosmos is seeded. The infrastructure for AI is built without anyone yet thinking of it as AI infrastructure.
2015–2020 — The predictive era
AI as early warning system
Epic deploys machine learning models trained on Cosmos data. The Sepsis Model, deterioration index, and readmission risk tools run in the background of every Epic installation, alerting nurses and physicians when risk crosses a threshold. These are narrow AI tools — each trained for one task, each requiring human response to every alert.
2020–2023 — The generative era
AI begins to write
Microsoft acquires Nuance Communications for $19.7 billion, primarily for its DAX ambient documentation technology. Epic integrates Azure OpenAI Service. The first generative AI features appear: automated drafts of responses to patient messages, AI chart summarization. Ambient clinical documentation enters early deployment at pilot sites.
2024–present — The agentic era
AI begins to act
DAX Copilot becomes generally available embedded in Epic, deploying to 150+ health systems. Epic announces its AI agents — Art, Penny, and the pre-visit assistant — at HIMSS 2025 and UGM 2025. The direction shifts from AI that responds to queries to AI that pursues goals, plans multi-step workflows, and acts under human oversight.

Each phase built on the previous one. Predictive models required Cosmos. Generative AI required the EHR integrations that Meaningful Use compliance had forced hospitals to build. Agentic AI requires the institutional trust that years of predictive and generative AI deployment have established. This is not a sudden transformation. It is the culmination of a forty-five-year compounding investment.

7 Ambient AI: DAX Copilot and the Doctor-Patient Relationship

The most immediately impactful AI deployment Epic has made is also the most human: a system designed to give doctors back the ability to look their patients in the eye.

DAX Copilot (Dragon Ambient eXperience Copilot) is built through a partnership between Epic, Microsoft, and Nuance. A physician opens Epic on a tablet, taps a button to begin a visit, and DAX Copilot activates. It listens to the conversation between the physician and patient — not recording a transcript, but understanding the clinical meaning of what is being said in real time, using large language models fine-tuned on clinical language. When the visit ends, DAX Copilot produces a complete, structured draft clinical note within seconds. The physician reads it, edits anything that needs correction, and approves it for the medical record. Documentation is done before the patient leaves the room.

Key concept
Ambient AI
Ambient AI refers to artificial intelligence that operates passively in the background of an environment, gathering information and generating outputs without requiring explicit prompts or commands from the user. The physician does not type a query. They do not dictate notes. They simply conduct the visit naturally, and the AI observes, understands, and acts. The word “ambient” is important: it means the AI is present without demanding attention — woven into the environment rather than imposed on it. This is the opposite of the EHR workflow it is replacing, which demanded physician attention constantly: menus to navigate, fields to fill, codes to select. Ambient AI asks for nothing from the clinician during the visit. It gives something back afterward: time.
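The ambient flow described above — no prompt from the clinician, a draft generated passively, nothing entering the record without human approval — can be sketched as a pipeline. The functions below are stand-in stubs, not real DAX or Epic APIs; note names and statuses are invented for illustration.

```python
# Minimal sketch of an ambient documentation pipeline: transcribe the
# visit, draft a structured note, and gate it behind clinician approval.
# `transcribe` and `draft_note` are stand-ins, not real DAX APIs.

def transcribe(audio):
    # Stand-in for speech-to-text over the visit audio.
    return "Patient reports 3 days of cough, no fever. Lungs clear."

def draft_note(transcript):
    # Stand-in for an LLM fine-tuned on clinical language producing a
    # structured SOAP-style draft from the conversation.
    return {
        "subjective": "3 days of cough, denies fever.",
        "objective": "Lungs clear to auscultation.",
        "assessment": "Likely viral upper respiratory infection.",
        "plan": "Supportive care; return if symptoms worsen.",
        "status": "DRAFT - pending clinician review",
    }

def ambient_visit(audio, clinician_approves):
    note = draft_note(transcribe(audio))
    if clinician_approves(note):      # the human-in-the-loop gate
        note["status"] = "SIGNED"
    return note

note = ambient_visit(b"...", clinician_approves=lambda n: True)
print(note["status"])  # SIGNED
```

The design choice worth noticing is that the approval gate is structural: an unapproved note can never carry any status other than draft.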

The measured results are striking. Clinicians using DAX report a 50% reduction in documentation time, 70% reduction in feelings of burnout and fatigue, an average of seven minutes saved per patient encounter, and the ability to see an additional five patients per day. At Northwestern Medicine, physicians using DAX in at least half their encounters were seeing an additional 11.3 patients per month. At WellSpan Health, 94% of physicians reported that DAX improved the quality of their patient interactions.

But the most important finding is not in the productivity data. When physicians are not typing, they look at their patients. That shift — from screen to face — changes the encounter in ways that are measurable in patient satisfaction scores but felt most directly in the examination room.

👨‍⚕️   Physician using DAX Copilot
“For the first time in years, I was actually present. Not multitasking between the patient and the keyboard. I could make eye contact. I could listen to what they were actually saying, not just what I needed to document. One of my patients asked me what was different. I told her I got a little help with the paperwork. She said, ‘You seem like yourself again.’”
🧑‍🦳   Patient perspective
“I used to feel like the doctor was only half listening — typing the whole time. Now they look at me. They ask follow-up questions. I feel like I’m talking to a human being again, not filling out a form. I don’t know what changed, but something did.” — Composite of patient experience reports from DAX deployment studies

In late 2025, Microsoft extended ambient AI to nurses with Dragon Copilot for nurses — designed specifically around nursing documentation workflows, which historically had been poorly served by existing AI tools. Emergency department nurses who spend roughly a quarter of their shift on EHR tasks are primary beneficiaries. The principle is the same: capture documentation passively while the nurse cares for the patient, rather than demanding documentation at the expense of care.

👩‍⚕️   Nurse perspective
“Nursing documentation was never designed around how nurses actually work. We move between patients constantly. We’re assessing, medicating, coordinating — and then expected to document everything at the end of a 12-hour shift from memory. AI that captures what I’m doing as I’m doing it isn’t just efficiency. That’s accuracy. That’s patient safety.” — Composite of reported nursing experiences

8 Agentic AI: When AI Stops Answering and Starts Acting

Ambient AI listens and generates. It produces a draft note that a human reviews. It is powerful, but it is still a tool that responds to input and requires human action to produce an outcome in the medical record. Agentic AI is categorically different.

At HIMSS 2025, Epic’s VP of R&D Seth Howard described the shift directly: “We’ve woven AI into the foundational capabilities of Epic, and we’ve been working towards an agentic platform for the past year or so. We’re really building on the foundation that we created to have generative AI as part of the software to start building reusable components that can take action, under human oversight.”

Key concept
Agentic AI
An agentic AI system is given a goal and autonomously plans and executes a sequence of actions to achieve it — without requiring a human to specify each step. A standard AI responds to a single prompt with a single output: you ask, it answers. An agentic AI receives an objective, breaks it into subtasks, uses tools (databases, APIs, messaging systems, scheduling platforms) to accomplish those subtasks, evaluates its own progress, and adjusts its approach based on what it finds. In healthcare, a generative AI writes a note when prompted. An agentic AI might review a patient’s chart, identify a missed diabetes screening, draft a personalized outreach message, suggest scheduling a follow-up, notify the care coordinator, and flag the case for nursing review — all without being asked to do any of those steps individually. The distinction is autonomy over a multi-step process, not just the generation of a single response.
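The agent loop in the definition above — receive a goal, decompose it, call tools, evaluate, surface actions for sign-off — can be sketched minimally. The "tools" below are toy stand-ins, not Epic APIs, and the care-gap list is invented for illustration.

```python
# Minimal agent loop: investigate with one tool, act with another, and
# queue every action for human review instead of committing it directly.
# All names and data here are illustrative, not Epic's.

CANDIDATE_GAPS = ["diabetes_screening", "flu_vaccine", "mammogram"]

def find_care_gaps(chart):
    # Tool 1: inspect the chart for missing preventive care.
    return [g for g in CANDIDATE_GAPS if g not in chart["completed"]]

def draft_outreach(gap):
    # Tool 2: draft (not send) a patient message for the gap.
    return f"You appear to be due for a {gap.replace('_', ' ')}."

def run_agent(goal, chart):
    """Plan -> act -> evaluate, surfacing all actions for human sign-off."""
    pending_review = []
    for gap in find_care_gaps(chart):        # decompose goal into subtasks
        pending_review.append({"gap": gap,
                               "draft": draft_outreach(gap),
                               "status": "awaiting human review"})
    if not pending_review:                   # self-evaluation step
        return [{"status": "goal already satisfied", "goal": goal}]
    return pending_review

actions = run_agent("close preventive care gaps",
                    {"completed": ["flu_vaccine"]})
for a in actions:
    print(a["gap"], "->", a["status"])
```

No step in the loop sends anything to a patient; the agent's terminal state is always a queue of drafts awaiting a human, which is the "under human oversight" constraint expressed in code.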

This distinction matters enormously in healthcare, because the stakes of autonomous action are unlike those in almost any other domain. Every previous chapter in this book introduced AI that makes a prediction or recommendation tied to a single action: Uber’s algorithm predicts demand and sets prices automatically; Spotify’s system predicts preferences and personalizes playlists. In each case, the prediction and the action are tightly linked, one to one. Agentic AI is different because it can pursue a complex, multi-step goal that requires reasoning, tool use, and adaptation — not just a single algorithmic output.

And that is precisely why the phrase “under human oversight” in Howard’s statement is not a legal disclaimer. It is the central engineering and governance question of Epic’s entire AI program. The capability of the AI is not the binding constraint. The question is: at what level of verified accuracy, and for which tasks, is it safe to let an AI act without a human checking every output?

9 Epic’s Agents: Art, Penny, and the Pre-Visit Assistant

Epic is not building one agentic AI. It is building a fleet of specialized agents, each designed for a specific clinical or administrative workflow. Think of them less as a single AI assistant and more as a team of invisible staff members — each with a defined role, operating within defined constraints, always surfacing their work to a human before it takes effect.

Art: The Clinical Intelligence Agent

Art is Epic’s clinician-facing AI agent. Before a physician walks into an exam room, Art prepares them. It reviews the patient’s entire clinical history in Epic, synthesizes the key information, and generates a concise pre-visit brief: the most relevant diagnoses, medication changes since the last visit, lab values trending in the wrong direction, and care gaps that should be addressed during the encounter. During complex cases, Art can search across the 16+ billion data points in Cosmos to surface comparable patients and what happened to them, answering questions like “What do patients with this combination of findings typically respond to?”

Art in practice
The two-minute pre-brief that used to take ten

A primary care physician hasn’t seen a patient in 18 months. Without Art, she would spend the first several minutes of the appointment scrolling through the EHR, reconstructing what had happened since the last visit: what medications changed, what specialist notes arrived, what test results need follow-up. It is not unusual for this to consume a third of a 20-minute appointment.

With Art, she opens the chart and reads a focused summary: two chronic conditions, one medication change made by cardiology, a rising creatinine trend nobody has acted on, and a mammogram that is two years overdue. She walks into the room already knowing what matters. The patient gets more care in the same amount of time.

Penny: The Revenue Cycle Agent

Penny addresses the part of healthcare patients rarely see: billing and insurance. When an insurance company denies a claim — which happens approximately 17% of the time for commercial insurers — a human staff member traditionally researches the denial, pulls together clinical justification, and writes an appeal letter from scratch. This process takes 45 minutes or more, and many denials go unchallenged simply because the labor cost of fighting them exceeds their financial value.

Penny does this autonomously: it reads the denial, retrieves the relevant clinical documentation from the EHR, identifies the applicable policy language, drafts the appeal letter, and presents it to an administrator for review and submission. The human approves or edits. The whole process takes minutes instead of hours — and the threshold for which denials are worth fighting changes entirely.

Why the administrative agent matters for patient care
Insurance claim denials cost U.S. health systems an estimated $19.7 billion per year in processing costs. When the cost of an appeal drops from 45 minutes of skilled staff time to 3 minutes of review, health systems can fight denials they previously let go. That recovered revenue funds nurses, equipment, and expanded services. Agentic AI in the administrative back office has a direct line to what is possible at the bedside. This is not an abstract connection — it is the economics of healthcare delivery.
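The economics in the box above can be made explicit with a back-of-envelope calculation. The staff cost rate and appeal success rate below are illustrative assumptions, not figures from the case; only the 45-minute and 3-minute review times come from the text.

```python
# Back-of-envelope: the smallest claim worth appealing, as a function of
# review time. Hourly cost and success rate are invented assumptions.

def breakeven_claim_value(minutes, hourly_cost=60.0, success_rate=0.5):
    """Smallest expected claim value where appealing beats writing it off."""
    labor_cost = hourly_cost * minutes / 60.0
    return labor_cost / success_rate

print(breakeven_claim_value(45))  # 90.0 -> claims under ~$90 get written off
print(breakeven_claim_value(3))   # 6.0  -> almost every denial is worth fighting
```

Under these assumptions, cutting review time from 45 minutes to 3 drops the break-even claim value fifteen-fold, which is why the set of denials worth contesting "changes entirely."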

The Pre-Visit Patient Assistant

Epic is also developing conversational AI agents that interact with patients before they arrive. These agents contact patients ahead of appointments, ask about the goals of the visit, gather information about any symptoms that have changed, confirm which medications they are still taking, and identify whether any prerequisite tests should be scheduled. They summarize this for both the patient (in MyChart) and the physician (in the pre-visit brief). The patient arrives more prepared. The physician arrives more informed. The visit is more productive before it begins.

🧑‍🦳   Patient using the pre-visit assistant
“The app messaged me a few days before my appointment asking what I wanted to talk about. I always forget one of the three things I’m worried about once I’m sitting in the exam room. This time, my doctor already knew what was on my mind when she walked in. We didn’t waste any time. We just talked about what mattered.”

10 The AI Factory at Epic

The AI Factory model we introduced in Chapter 1 — data feeding models, models generating predictions, predictions informing decisions, decisions creating value that generates new data — maps directly onto Epic’s AI architecture. With one important difference from every previous chapter: at Epic, the loop includes patients, and the value being created is not just commercial. It is clinical.

1. Data: 280M patients, 16B+ clinical data points
2. Model: GPT-4 + Cosmos fine-tuning + domain-specific models
3. Prediction: Draft notes, risk scores, care gaps, appeal letters
4. Decision: Clinician reviews and approves
5. Value: Better care → new data → better models

Notice step 4. Unlike Uber, which closed the prediction-decision gap entirely, Epic has — by design, and for now — kept the human inside that gap for all clinical decisions. Every AI-generated note, every flagged risk alert, every care gap suggestion passes through a clinician before it becomes part of a patient’s official record or triggers any action. This is the same approach Spotify takes with its “algotorial” model: AI does the fast, scalable work; humans approve the consequential outputs.

Data
What Epic does: Every clinical encounter generates structured and unstructured data flowing into Cosmos — the largest longitudinal clinical dataset ever assembled.
Business lesson: Healthcare data is uniquely irreplaceable — built over decades through trusted relationships no competitor can fast-follow.
Key concepts: Data moat; longitudinal clinical data

Model
What Epic does: Azure OpenAI (GPT-4 and successors) fine-tuned on Cosmos clinical data, plus domain-specific models for sepsis prediction, deterioration risk, and readmission probability.
Business lesson: General foundation models need domain fine-tuning to perform in specialized fields — clinical expertise lives in what the model is trained on, not just its architecture.
Key concepts: Fine-tuning; domain adaptation; foundation models

Prediction
What Epic does: Ambient notes (DAX), risk alerts (sepsis, deterioration), care gap identification, insurance appeal drafts, chart summaries, pre-visit briefings, patient outreach messages.
Business lesson: In healthcare, AI predictions are outputs to be reviewed, not decisions to be automated — the prediction-decision gap is preserved by design, not by oversight failure.
Key concepts: Ambient AI; predictive models; agentic AI

Decision
What Epic does: Physicians review and approve AI notes; nurses respond to or dismiss risk alerts; administrators approve appeal letters; clinicians act on care gap notifications.
Business lesson: Keeping humans in the decision loop is both ethically correct and strategically necessary for adoption — clinician trust determines deployment success more than technical accuracy.
Key concepts: Human-in-the-loop; augmentation vs. automation

Value
What Epic does: Reduced burnout, more time with patients, earlier sepsis detection, recovered insurance revenue, fewer preventable readmissions.
Business lesson: Healthcare AI value is measured in clinician time returned and patient outcomes improved — not engagement metrics or click-through rates.
Key concepts: Augmentation ROI; clinical efficiency

Loop back
What Epic does: Every physician edit of an AI note, every accepted or rejected risk alert, every corrected suggestion becomes training signal that improves the next model version.
Business lesson: Expert corrections in high-stakes domains are extraordinarily valuable training data — the humans in the loop are also teaching the model how to improve.
Key concepts: RLHF; continuous improvement; compounding advantage
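The loop-back step — a physician's edit becoming training signal — can be sketched concretely. The schema below is invented for illustration; the idea is simply that every (draft, correction) pair is data a later fine-tuning or preference-training run can learn from.

```python
import difflib

# Toy sketch: capture a clinician's edit to an AI draft as a training
# signal. The record structure is an invented illustration.
def edit_signal(draft, final):
    """Record whether and how much the clinician changed the AI's draft."""
    ratio = difflib.SequenceMatcher(None, draft, final).ratio()
    return {"accepted_as_is": draft == final,
            "similarity": round(ratio, 2),
            "pair": (draft, final)}

sig = edit_signal("Patient denies fever.", "Patient denies fever or chills.")
print(sig["accepted_as_is"], sig["similarity"])
```

Aggregated across thousands of clinicians, these pairs tell the model builders exactly where drafts fall short — which is why expert corrections are such valuable training data.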

11 Prompt Engineering and AI Prototyping in Healthcare

Two concepts introduced elsewhere in this course take on particular importance in healthcare, where the stakes of AI outputs are highest.

Prompt Engineering as a Safety Discipline

Prompt engineering is the practice of deliberately designing the inputs given to an AI system to produce reliable, accurate, and safe outputs. In consumer applications, prompt engineering is often casual — you experiment with different phrasings until you get a useful result. In clinical AI, prompt engineering is an engineering discipline with patient safety implications, and it is treated accordingly.

Key concept
Prompt engineering in high-stakes AI
When Epic deploys a generative AI feature — like automated drafts of responses to patient messages in the physician’s EHR inbox — the prompts given to the underlying model are not casual suggestions. They are carefully engineered instructions that specify the model’s role, the format and length of its output, the clinical constraints on what it should and should not say, and the signals that should cause it to defer to a human clinician rather than generate a response. These prompts are developed in collaboration with clinicians, ethicists, and patient safety teams, tested against thousands of real and synthetic cases, and governed like any other medical device configuration. They are versioned, audited, and updated when outputs reveal problems. In clinical AI, “just tweak the prompt” is not a debugging strategy. It is an engineering process with safety review at every step.

AI Prototyping: From Clinical Problem to Working Solution in Days

One of the most significant shifts in how Epic and its health system partners build new features is the rise of AI prototyping — using AI tools to produce working drafts of features and workflows in hours or days rather than months.

Key concept
AI prototyping
AI prototyping refers to using AI tools — typically large language models accessed through natural language — to produce working drafts of products, features, interfaces, or workflows rapidly, dramatically reducing the cost and time of making an idea tangible enough to test and evaluate. In healthcare IT, a clinical workflow that previously required months of requirements gathering, design, engineering, and QA can now be rough-prototyped in days by a small team using AI-assisted development tools. This does not eliminate the validation, clinical testing, and safety review that medical software requires. But it changes what ideas are worth exploring, because the cost of exploration has collapsed. And it changes who can meaningfully participate in that exploration: a nurse who identifies a documentation problem can now describe a potential solution in plain language, use AI to build a prototype, and show it to engineers — rather than writing a formal requirements document that may be read months later.

This is making healthcare software development more responsive to clinicians — which, historically, has been one of its most persistent and costly failures. EHRs were designed by engineers, for workflows that engineers imagined clinicians had. AI prototyping, combined with ambient AI tools that make clinical processes more visible to software developers, is beginning to close that gap.

12 The Human-in-the-Loop Question

Every AI system in this chapter keeps a human in the loop at the decision stage. This is intentional — and it will not stay this way forever. Understanding why this design choice was made, and how it will evolve, is one of the most important things this chapter can teach.

Key concept
Human-in-the-loop vs. human-on-the-loop
These terms describe two different relationships between humans and AI systems. Human-in-the-loop: a human must actively review and approve each AI output before it takes effect. The physician who reads the DAX note before it enters the chart is in-the-loop. Human-on-the-loop: the AI acts autonomously while a human monitors and can intervene. An administrator who reviews a dashboard of AI-submitted insurance appeals rather than approving each letter individually is on-the-loop. As AI systems demonstrate sustained accuracy across millions of interactions, and as regulatory frameworks and institutional trust develop, organizations tend to move from in-the-loop to on-the-loop for specific tasks. The key question is not whether this transition will happen, but for which tasks, at what demonstrated accuracy level, and with what governance structures it is safe to make.
The stakes of getting this wrong
Healthcare AI errors are unlike errors in other domains. If Spotify recommends a song you dislike, you skip it. If an agentic AI gets a medication dosage wrong, misreads an allergy, or fails to escalate a deteriorating patient, the consequences can be permanent and irreversible. This asymmetry — the catastrophically high cost of certain AI errors in medicine versus virtually every other industry — is why the movement from human-in-the-loop to human-on-the-loop in healthcare is much slower, much more evidence-dependent, and much more heavily regulated than in other sectors. The question is not whether AI can be accurate. The question is: how accurate does it need to be, validated across how many patients, before we stop verifying every output it produces? And who is legally and ethically responsible when that calculation proves wrong?
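The difference between the two loop designs fits in a few lines of code. The appeal letters and review rule below are invented, purely illustrative:

```python
def human_in_the_loop(drafts, approve):
    """Nothing takes effect until a human approves each individual item."""
    return [d for d in drafts if approve(d)]

def human_on_the_loop(drafts, flag_for_review):
    """Everything takes effect; a human monitors a dashboard of exceptions."""
    executed = list(drafts)                             # AI acts autonomously
    exceptions = [d for d in drafts if flag_for_review(d)]
    return executed, exceptions

appeals = ["appeal-1", "appeal-2 (unusual)", "appeal-3"]

# In-the-loop: an administrator approves each appeal letter individually.
sent = human_in_the_loop(appeals, approve=lambda d: "unusual" not in d)

# On-the-loop: all letters are submitted; only the unusual one surfaces for review.
submitted, dashboard = human_on_the_loop(appeals,
                                         flag_for_review=lambda d: "unusual" in d)
```

In the first design nothing happens without approval; in the second everything happens, and the human's job shifts from gatekeeping each output to monitoring the exceptions.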

13 Four Perspectives: How AI Is Changing Healthcare for Everyone

AI in healthcare is not a monolithic experience. It looks different depending on where you sit in the system. Here is what the current transformation means for each major stakeholder — including what each group gains and what each group risks.

Physicians

For physicians, the AI era represents the first genuine attempt to reverse the burnout crisis that EHR adoption created. Ambient documentation is returning time that was stolen by the keyboard. Predictive models are adding a layer of continuous pattern recognition that no individual physician can maintain across dozens of simultaneous patients. Pre-visit briefing agents are making it possible to be fully prepared for complex patients without spending a third of the appointment catching up on the record.

For physicians, the corresponding risk is alert fatigue. When AI systems generate too many notifications — too many risk scores, care gap flags, suggested actions — physicians stop reading them. Epic's sepsis prediction model has been studied at multiple hospitals where clinicians began reflexively dismissing alerts because they arrived too frequently. The benefit of the model was neutralized by its own volume. Epic's challenge is not just building AI that is accurate. It is building AI whose alerts physicians will actually act on — which requires calibration, context, and an understanding of clinical workflow that purely technical optimization cannot provide.
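A toy calculation shows how volume, not accuracy, can neutralize a model. The risk scores below are made up and unrelated to any real sepsis model; the sketch only illustrates the threshold trade-off between sensitivity and alert volume:

```python
def alert_stats(scored_patients, threshold):
    """scored_patients: list of (risk_score, truly_septic) pairs."""
    alerts = [(s, y) for s, y in scored_patients if s >= threshold]
    caught = sum(1 for _, y in alerts if y)
    total_true = sum(1 for _, y in scored_patients if y) or 1
    return {
        "alert_count": len(alerts),
        "sensitivity": caught / total_true,
        "precision": caught / len(alerts) if alerts else 0.0,
    }

# Made-up cohort: mostly low-risk patients, five true sepsis cases.
cohort = ([(0.2, False)] * 80 + [(0.6, False)] * 15
          + [(0.7, True), (0.9, True)] + [(0.6, True)] * 3)

low = alert_stats(cohort, threshold=0.5)    # catches everything, floods clinicians
high = alert_stats(cohort, threshold=0.65)  # every alert is real, but cases are missed
```

Lowering the threshold catches every true case but produces 20 alerts for 5 real events; raising it makes every alert meaningful but misses three of the five cases. Calibration is choosing the point on that curve where clinicians can actually live.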

Nurses

Nurses have historically been the most underserved group in health IT. EHR systems were designed primarily around physician workflows. Nursing documentation was often retrofitted into tools that did not fit how nurses actually work: in motion, across multiple patients simultaneously, with constant interruption. The extension of ambient AI to nursing workflows, the development of AI-assisted flowsheet population, and the use of AI to reduce end-of-shift documentation backlogs are meaningful steps toward healthcare AI that serves the people who provide the most direct patient care.

Nurses are also among the most cautious about AI accuracy in specific contexts. A physician reviewing an AI-generated note can catch errors through clinical expertise. In some nursing workflows, the safety redundancies are fewer. Trust calibration matters enormously, and it must be earned task by task, not assumed.

Patients

For patients, the most visible change is often the most subtle: physicians who look at them again. The return of presence — eye contact, genuine listening, follow-up questions — is the human dividend of AI absorbing the documentation burden. Patients do not see DAX Copilot. They see what it gives back.

Less visible is the AI acting on their behalf when they are not in the room: the sepsis model that flags their deterioration at 3am, the care gap alert that prompts their physician to order an overdue mammogram, the pre-visit assistant that ensures the appointment addresses what actually matters to them. These interventions are invisible but may determine whether they receive the right care at the right time — or don’t.

🧑‍🦳   Patient perspective on the invisible AI
“I’ve been coming to this clinic for ten years. Last month my doctor brought up a test I hadn’t had in two years. I didn’t even know I was overdue for it. She caught it. I don’t know if it was her or the computer that remembered — but something did.”

Healthcare Administrators and the System

For hospital administrators, AI represents a dual opportunity: reduce costs through automation of administrative workflows, and increase revenue through systematic recovery of denied claims and improved physician throughput. For the broader healthcare system, AI that reduces unnecessary tests and hospital readmissions is a societal benefit — but AI that helps hospitals fight more insurance denials is a direct cost to payers. The economic dynamics are not zero-sum across the whole system, but they create real tensions between stakeholders in specific transactions.

14 Beyond Today: What Comes Next in Healthcare AI

The agentic AI Epic is deploying in 2024 and 2025 is closer to the beginning of a long arc than to its endpoint. Here is where the evidence and the trajectory of investment suggest healthcare AI is heading.

Near term (2025–2027): Greater autonomy, task by task

The current generation of Epic’s agents operates in human-in-the-loop mode for clinical decisions. As confidence builds through validation studies and outcome data, lower-stakes tasks will shift to human-on-the-loop operation. Insurance appeal letters may be submitted automatically, with humans reviewing exceptions. Routine preventive outreach may become fully automated. The shift will happen task by task, evidence by evidence, not as a single policy change.

Near-term projection (2025–2027)
Systematic care gap closure at scale: Within two to three years, health systems using Epic are likely to have fully agentic workflows that proactively identify patients overdue for preventive screenings, draft and send personalized outreach, suggest appointment scheduling, and follow up with non-responders — all without any human initiating each individual action. For a primary care practice with 2,000 patients, this kind of systematic proactive outreach has been operationally impossible. Agentic AI will make it routine. The patients who benefit most will be those who were too busy, too intimidated, or too isolated to reach out themselves.
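The projected workflow reduces to a plain loop: detect the gap, reach out, follow up, and escalate persistent non-responders to a human. All of the record fields, intervals, and action names below are hypothetical, invented to illustrate the shape of such an agent:

```python
from datetime import date, timedelta

SCREENING_INTERVAL = timedelta(days=365 * 2)  # hypothetical screening interval

def care_gap_agent(patients, today, max_reminders=2):
    """One pass of an agentic outreach loop (illustrative sketch only)."""
    actions = []
    for p in patients:
        if today - p["last_screening"] < SCREENING_INTERVAL:
            continue  # no care gap: nothing to do
        if p["responded"]:
            actions.append(("schedule", p["name"]))           # offer an appointment
        elif p["reminders_sent"] < max_reminders:
            p["reminders_sent"] += 1
            actions.append(("send_reminder", p["name"]))      # personalized outreach
        else:
            actions.append(("escalate_to_staff", p["name"]))  # human checkpoint
    return actions

patients = [
    {"name": "A", "last_screening": date(2021, 1, 1), "responded": False, "reminders_sent": 0},
    {"name": "B", "last_screening": date(2024, 6, 1), "responded": False, "reminders_sent": 0},
    {"name": "C", "last_screening": date(2020, 5, 1), "responded": False, "reminders_sent": 2},
]
actions = care_gap_agent(patients, today=date(2025, 1, 1))
```

Run daily over a whole panel, a loop like this does what no practice staff can do by hand: it never forgets a patient, and the only work that reaches a human is the residue the agent could not resolve.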

Medium term (2027–2032): Multimodal AI and continuous monitoring

Epic’s leadership has explicitly described a roadmap toward native multimodal AI capabilities — processing not just text but video, images, and genomic data. This opens the door to AI-assisted diagnostics that go beyond pattern-matching in text records. An AI agent that can analyze a dermatology photograph, a radiology scan, and the patient’s full clinical history simultaneously — and surface a differential diagnosis for a physician to review — is within technical reach.

Medium-term projection (2027–2032)
The hospital-at-home model, powered by continuous AI surveillance: Wearable devices already generate continuous streams of physiological data. As integration between wearables and Epic improves, agentic AI will monitor this stream for clinical signals, alerting care teams to changes before they become emergencies. Patients recovering from surgery, managing chronic diseases, or at elevated risk will receive effective clinical monitoring from home. The hospital-at-home model is already in early deployment and will expand substantially — shifting care from a location-based model to a continuous, data-driven one.
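At its core, such a monitor is a loop over a vitals stream with escalation rules. The thresholds below are invented for illustration; real clinical limits are patient- and condition-specific:

```python
def monitor(readings, hr_limit=120, spo2_floor=92, sustained=3):
    """Alert only when readings breach a limit for `sustained` consecutive samples,
    so one noisy wearable reading does not page the care team."""
    alerts, streak = [], 0
    for i, r in enumerate(readings):
        abnormal = r["hr"] > hr_limit or r["spo2"] < spo2_floor
        streak = streak + 1 if abnormal else 0
        if streak == sustained:
            alerts.append(i)  # escalate to the care team at this sample
    return alerts

stream = [{"hr": 80, "spo2": 97}, {"hr": 130, "spo2": 96},   # transient spike: ignored
          {"hr": 85, "spo2": 97}, {"hr": 125, "spo2": 91},
          {"hr": 128, "spo2": 90}, {"hr": 131, "spo2": 89}]  # sustained: escalate
alerts = monitor(stream)  # [5]
```

Requiring several consecutive abnormal samples is one simple guard against wearable noise; production systems layer far more sophisticated filtering, trend analysis, and patient-specific baselines on top of the same basic loop.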

Longer term (2032+): AI as a structural participant in care delivery

The U.S. faces a projected shortage of 86,000 physicians by 2036. AI cannot fill that gap by being a better search engine. But AI agents that handle routine chronic disease monitoring, medication refill management, follow-up on stable conditions, and patient question triage — while escalating anything requiring physician judgment — could meaningfully extend the productive capacity of each physician. This is not a future in which AI replaces doctors. It is a future in which AI makes it possible to extend physician-level oversight to populations who currently have very limited access to it.

Longer-term projection (2032+)
The AI primary care layer: In underserved communities, rural areas, and developing nations, physician access is already deeply constrained. AI agents trained on Cosmos-scale data and validated against millions of clinical outcomes could extend a meaningful layer of evidence-based care to populations that currently lack access to any specialist and see a primary care physician rarely if ever. This is healthcare AI’s most ambitious potential contribution — not making existing care more efficient, but making care accessible where it currently does not exist.
The risks that accompany the progress
Every expansion of AI autonomy in healthcare carries risks the field is still learning to manage. AI bias: models trained on historical data perpetuate the disparities embedded in that data, potentially producing worse predictions for underrepresented populations. Errors at scale: a bug in an algorithm deployed to 3,600 hospitals could affect millions of patients before detection. Regulatory lag: the FDA is still developing frameworks for governing agentic AI as a medical device. Liability: when an AI agent causes harm, who is responsible — the hospital, Epic, Microsoft, or the physician who trusted the output? These are not hypothetical questions. They are active legal and policy debates that will shape the pace and architecture of healthcare AI for years to come.

15 Key Vocabulary & Discussion Questions

Key vocabulary introduced in this chapter

Agentic AI
AI that, given a goal, autonomously plans and executes multi-step processes — qualitatively different from systems that respond to a single prompt with a single output
Ambient AI
AI that operates passively in the background, capturing and processing information without explicit commands — DAX Copilot listens to clinical conversations without the physician typing a word
Electronic Health Record (EHR)
A digital system for storing and managing patient medical information across encounters, providers, and settings — the platform on which all healthcare AI is being built
HITECH Act / Meaningful Use
The 2009 U.S. law that drove the rapid digitization of American healthcare through financial incentives and penalties — the policy decision that created the data foundation for healthcare AI
Pajama time
The informal term for after-hours documentation work done by physicians at home — a measurable indicator of EHR administrative burden and a primary target of ambient AI solutions
Longitudinal clinical data
Patient data captured across time and multiple care settings — what teaches AI models how diseases progress and what interventions work, rather than snapshots of isolated moments
Cosmos
Epic’s de-identified clinical data platform: 280+ million patients, 16+ billion data points — the data moat that makes Epic’s AI strategy structurally difficult for competitors to replicate
Fine-tuning
Taking a pre-trained foundation model and continuing to train it on domain-specific data — how Epic adapts GPT-4 to understand the specialized language and reasoning patterns of clinical medicine
Prompt engineering
The systematic design and testing of AI inputs to produce reliable, accurate outputs at scale — in clinical AI, a safety-critical engineering discipline governed like any other medical device configuration
AI prototyping
Using AI tools to produce working drafts of features in hours or days — changing who can participate in healthcare software development and accelerating the feedback loop between clinical need and technical solution
Human-in-the-loop
A system design requiring a human to review and approve each AI output before it takes effect — Epic’s current standard for all clinical decisions
Human-on-the-loop
A system design where AI acts autonomously while humans monitor and can intervene — the direction specific healthcare AI tasks are moving as accuracy is validated over time
Alert fatigue
When clinicians begin ignoring AI alerts because they are too frequent or poorly calibrated — one of the most significant barriers to effective clinical AI deployment, independent of the underlying model’s accuracy
Data moat
A proprietary dataset so large, domain-specific, and difficult to replicate that it creates structural competitive advantage — Cosmos is arguably the most powerful data moat in enterprise healthcare
RLHF (Reinforcement Learning from Human Feedback)
A training technique using human expert evaluations to improve AI models — every physician edit of an AI-generated clinical note is a learning signal that can make the next note better
Platform consolidation
An incumbent platform internalizing capabilities that third-party vendors built on its ecosystem — Epic embedding ambient documentation removes the need for separate vendor contracts and shifts competitive dynamics

Discussion questions

Eight questions anchored in the themes of this chapter. These work equally well as written assignments or in-class discussion. Questions 2, 5, and 7 tend to generate the most debate.

  1. The unintended consequence (Sections 4 & 7). The HITECH Act drove the digitization of American healthcare and created the data foundation that now makes AI possible — but it also created the burnout crisis that AI is now trying to solve. Is this a story of good policy with unintended consequences, or a failure of policy design to anticipate human factors? What would you have done differently in 2009? Does your answer change knowing that without HITECH, the data infrastructure for today’s clinical AI would not exist?
  2. Whose problem is AI solving? (Sections 4, 7, 13). This chapter presents ambient AI as a solution to clinician burnout. But hospitals also benefit when physicians see more patients per day. Epic earns revenue. Insurance companies may pay more claims when Penny helps hospitals fight more denials. Is Epic reducing burnout as an act of mission — or as a business strategy that happens to benefit clinicians? Can it be both? Does the motivation matter if the outcome is the same?
  3. The Uber comparison (Section 10). Uber closed the prediction-decision gap entirely — predictions trigger automated decisions at machine speed. Epic has deliberately preserved the gap for clinical decisions. Both made rational choices for their contexts. Pick one specific Epic AI feature (Art, Penny, DAX, the sepsis model, or the pre-visit assistant) and argue for closing its prediction-decision gap entirely — letting it act without human review. Now argue against. What specific evidence would move you from one position to the other?
  4. Alert fatigue and the signal problem (Section 13). Epic’s sepsis model at some hospitals generated so many alerts that clinicians began dismissing them without reading them — neutralizing the model’s benefit entirely. This is not a model accuracy failure. It is a workflow integration failure. If you were designing a clinical AI alert system, what would you change? Be specific: what would you measure, what thresholds would you set, how would you know if you got it right, and what happens if you got it wrong?
  5. The data moat and equity (Section 5). Cosmos is built primarily from data generated by patients at large, well-resourced U.S. health systems. Rural patients, uninsured patients, and patients at safety-net hospitals are underrepresented. If AI models are trained on this data, how might that shape their predictions for patients who look less like the training population? Is this a problem Epic can solve? Who is responsible for solving it if Epic cannot or will not?
  6. Prompt engineering as safety engineering (Section 11). This chapter argues that prompt engineering in clinical AI is a safety-critical engineering discipline, not casual experimentation. Design a governance framework for how a hospital should manage the prompts used in its clinical AI systems. Who writes them? Who reviews and approves them? How often are they updated? What happens when a prompt produces a harmful output? Use the principles of this chapter to defend your framework.
  7. The liability question (Section 14). An AI agent drafts a clinical note containing a dosage error. The physician reviews it, does not catch the mistake, and approves it. The patient is harmed. Who is legally and ethically responsible — the physician, the hospital, Epic, or Microsoft? Now change the scenario: the physician was reviewing 40 AI-generated notes that morning and had four minutes per note. Does that change your answer? Should it change how AI-generated clinical content is governed?
  8. Your career in the AI healthcare era (Section 14). The physician shortage and the expansion of healthcare AI are happening simultaneously. AI agents will increasingly extend clinician capacity, not replace it. Pick a healthcare-adjacent career that interests you — health system administration, health insurance, pharmaceuticals, public health, consulting, or health IT. Identify one specific workflow in that field where agentic AI is likely to be deployed within five years. Describe what the agent does, where the human checkpoint sits, and what specific skills would make you the person who works effectively with that agent — rather than the person whose role the agent has replaced.

MIS 432 · AI in Business · Case Study · For classroom discussion purposes.
