In this lab you will move from asking AI a question to building an AI that takes a sequence of actions — experiencing firsthand the difference between a chatbot and an agent, and designing the human-in-the-loop safeguards that make agentic AI safe enough to deploy.
This lab has three parts that each deepen your understanding of what "agentic AI" actually means in practice — not just as a concept from the chapter, but as something you can observe, test, and evaluate yourself.
Before building anything, you need a clear mental model of the difference. A chatbot answers a question — one input, one output, done. An agent pursues a goal — it takes a series of steps, uses tools, checks its own work, and adapts when something doesn't go as planned. The difference is not a matter of how smart the AI is. It is a matter of how it is designed.
For each of Epic's three agents from Chapter 7, fill in the table. Use your own words — the goal is to make these concepts concrete, not to reproduce the chapter.
| Agent | What goal is it pursuing? | What steps does it take? | What tools does it use? | Where is the human in the loop? |
|---|---|---|---|---|
| Art Clinical intelligence |
Your answer | Your answer | Your answer | Your answer |
| Penny Insurance appeals |
Your answer | Your answer | Your answer | Your answer |
| Pre-visit assistant Patient preparation |
Your answer | Your answer | Your answer | Your answer |
Penny replaces a process where an administrator asked a chatbot "help me write an appeal for this denial" and got back a generic draft. How is the agentic version of Penny different from that? Write 3–4 sentences identifying the specific differences in what the system does — not just in how good the output is.
You are going to build a simplified version of Penny — the agent that reads an insurance denial, pulls together the relevant clinical context, and drafts an appeal letter. The real Penny connects to Epic's EHR and retrieves actual patient records. Your version will work with realistic fictional patient data you provide directly. The underlying logic — read the denial, reason about it, draft a response — is the same.
More importantly: you are going to build it in stages, deliberately. First as a simple single-prompt chatbot. Then as a multi-step agent. The difference between those two versions is the entire point of Chapter 7.
!pip install anthropic --quiet import anthropic import json # Paste your API key here — keep this notebook private API_KEY = "your-api-key-here" client = anthropic.Anthropic(api_key=API_KEY) print("✓ Anthropic client ready")
First, build the naive version — one prompt, one response. This is what existed before agents. Notice what it can and can't do.
# The insurance denial we need to appeal denial = """ INSURANCE DENIAL NOTICE Patient: Maria Chen, DOB 03/14/1958 Claim #: HC-2024-88421 Service: Continuous Glucose Monitor (CGM) device + supplies Denial reason: Not medically necessary. Patient does not meet criteria for CGM coverage under policy section 4.2.1. Criteria requires: Type 1 diabetes OR insulin-dependent Type 2 diabetes with documented hypoglycemic episodes. """ # Patient clinical context (in a real system, Penny retrieves this from the EHR) patient_context = """ Patient: Maria Chen Diagnosis: Type 2 Diabetes Mellitus (E11.9), diagnosed 2019 Current medications: Metformin 1000mg BID, Glipizide 10mg daily (Glipizide is a sulfonylurea — a class known to cause hypoglycemia) Recent labs: HbA1c 8.4% (elevated), last 3 months Recent notes: Patient reported two episodes of dizziness and confusion in past month; blood glucose 58 mg/dL on one occasion. Physician ordered CGM to monitor for hypoglycemic patterns. """ # VERSION A: Single prompt — just ask the AI to write the appeal response_a = client.messages.create( model="claude-sonnet-4-20250514", max_tokens=1000, messages=[{ "role": "user", "content": f"""Write an insurance appeal letter for this denial. Denial: {denial} Patient context: {patient_context} Write a professional appeal letter.""" }] ) print("=== VERSION A: CHATBOT APPROACH ===") print(response_a.content[0].text)
Now build the same task as an agent — three distinct steps, each building on the last. This is how Penny actually works.
def call_claude(prompt, system=None): """Helper: send a prompt, return the text response.""" messages = [{"role": "user", "content": prompt}] kwargs = {"model": "claude-sonnet-4-20250514", "max_tokens": 1000, "messages": messages} if system: kwargs["system"] = system response = client.messages.create(**kwargs) return response.content[0].text print("=== VERSION B: AGENTIC APPROACH ===") print("Running 3-step agent pipeline...\n") # STEP 1: Analyze the denial — identify the specific criteria gap print("Step 1: Analyzing the denial...") denial_analysis = call_claude( f"""You are an expert in insurance policy and medical billing. Analyze this insurance denial carefully: {denial} Identify: 1. The exact policy criteria that triggered the denial 2. The specific gap between what the insurer requires and what they know about the patient 3. What clinical evidence, if present, would override this denial Be specific and cite the policy section mentioned.""", system="You are a medical billing specialist with deep expertise in insurance policy language." ) print("✓ Denial analysis complete\n") print(denial_analysis) print("\n" + "="*50 + "\n") # STEP 2: Match clinical evidence to policy requirements print("Step 2: Matching clinical evidence to policy...") evidence_match = call_claude( f"""You are a medical billing specialist. The denial analysis identified these requirements: {denial_analysis} The patient's clinical record contains: {patient_context} Identify: 1. Which specific clinical facts in the record satisfy the insurer's criteria 2. Which policy language the clinical facts speak to directly 3. Any clinical details that strengthen the medical necessity argument Be precise — quote the relevant clinical facts and map them to specific criteria.""", system="You are a medical billing specialist who builds airtight insurance appeals." ) print("✓ Evidence mapping complete\n") print(evidence_match) print("\n" + "="*50 + "\n") # STEP 3: Draft the appeal using the structured analysis print("Step 3: Drafting the appeal letter...") appeal_letter = call_claude( f"""You are drafting a formal insurance appeal letter. Use this analysis of the denial: {denial_analysis} And this mapping of clinical evidence to policy criteria: {evidence_match} Write a formal, professional appeal letter that: - Cites the specific policy section and explains why the denial is incorrect - References the clinical evidence that satisfies the insurer's criteria - Is factual, specific, and concise — no generic language - Ends with a clear request for reconsideration Format it as a real letter ready for submission.""", system="You are a medical billing specialist drafting a formal insurance appeal. Be precise, factual, and professional." ) print("✓ Appeal letter drafted\n") print("=== FINAL APPEAL LETTER ===") print(appeal_letter)
| Question | Version A (chatbot) | Version B (agent) |
|---|---|---|
| Does it identify the specific policy criteria? | Your observation | Your observation |
| Does it cite specific clinical facts from the record? | Your observation | Your observation |
| Would a billing specialist need to rewrite it before submitting? | Your observation | Your observation |
| Could it handle a more complex denial with multiple criteria? | Your observation | Your observation |
Modify the denial and patient context and run the agent again. Try a different type of denial — a medication, a procedure, or a specialist referral. Observe how the agent adapts its analysis to the new scenario.
You have just built an agent that can analyze a denial, match evidence to policy, and draft an appeal — all without human involvement. The agent works. Now the harder question: should it work without human involvement? At every step? Or only some steps?
This is the human-in-the-loop question from Chapter 7, Section 12 — and it does not have one right answer. It depends on accuracy, stakes, accountability, and trust built over time. Your job in this part is to think through it seriously for the agent you just built.
For each of the three steps in your Version B agent, assess the risk of the AI getting it wrong — and the consequence if it does.
| Agent step | What could go wrong? | Consequence if wrong | Human review: required or optional? |
|---|---|---|---|
| Step 1: Analyze the denial | Your answer | Your answer | Your answer |
| Step 2: Match clinical evidence | Your answer | Your answer | Your answer |
| Step 3: Draft the appeal letter | Your answer | Your answer | Your answer |
| Final submission to insurer | Your answer | Your answer | Your answer |
Real agents don't just act — they verify their own work before surfacing it to humans. Add a fourth step to your agent that reviews the appeal letter before it is presented for approval.
# STEP 4: Self-check — the agent reviews its own output print("Step 4: Agent self-check...") self_check = call_claude( f"""You are a senior medical billing reviewer. An AI agent has drafted this insurance appeal letter: {appeal_letter} The original denial was: {denial} The patient's clinical record contains: {patient_context} Review the appeal critically: 1. Does every factual claim in the letter match the clinical record exactly? 2. Does the letter address ALL the specific criteria in the denial, or does it miss any? 3. Is there any language that could weaken the appeal or give the insurer grounds to re-deny? 4. Rate the appeal: READY TO SUBMIT / NEEDS MINOR REVISION / NEEDS MAJOR REVISION Be specific about any issues found.""", system="You are a senior medical billing reviewer. Be critical and precise." ) print("=== AGENT SELF-CHECK ===") print(self_check) # In a real system, the agent would route the letter differently # based on the self-check result before presenting to a human if "READY TO SUBMIT" in self_check.upper(): print("\n→ Agent confidence: HIGH — presenting to administrator for final review") elif "MINOR" in self_check.upper(): print("\n→ Agent confidence: MODERATE — flagging issues for administrator before review") else: print("\n→ Agent confidence: LOW — escalating to senior billing specialist")