Lesson 12 — AI Literacy: Prompting, Verification, Hallucinations, and Prompt Logs
(from “asking questions” to reliable workflows; a taste of agentic loops)
Why this matters (motivation)¶
By now you can:
clean data,
run regressions and clustering,
build trees and text signals,
and forecast with baselines.
Generative AI can help you do these tasks faster. But it can also:
produce plausible but wrong explanations,
invent references,
write code that runs but answers the wrong question,
or encourage overconfident conclusions.
This week makes sure you can use AI tools responsibly and effectively—in a way that faculty (and future employers) will accept.
Part A — What “AI literacy” means in this course¶
What AI is good at (in our workflow)¶
drafting code scaffolds
suggesting EDA checks and chart types
summarizing text (when you cross-check)
proposing alternative explanations or sensitivity checks
turning messy notes into structured outlines (not final content)
What AI is bad at (common risks)¶
factual reliability without sources
citations (often hallucinated)
domain-specific judgment
hidden assumptions in code
causal claims without design
overly confident “final answers”
Part B — Prompting patterns that produce better work¶
Pattern 1: Ask for assumptions first¶
Instead of:
“Analyze this dataset.”
Try:
“What assumptions do you need to make to analyze this dataset responsibly? List them and suggest checks.”
Pattern 2: Ask for a plan, not a final product¶
“Propose a step-by-step workflow (EDA → cleaning → model → diagnostics → communication) for this question.”
Pattern 3: Ask for checks and failure modes¶
“List the top 5 ways this analysis could be misleading and how to test each.”
Pattern 4: Ask for alternatives¶
“Give two alternative model specifications and explain when each is appropriate.”
Pattern 5: Force careful language¶
“Write conclusions using association language and include one limitation and one potential confounder.”
Part C — Verification: a practical checklist¶
1) Code verification (fast sanity checks)¶
run on a small subset (e.g., first 100 rows)
check shapes and dtypes before/after transformations
compare results with a simpler method (baseline)
print intermediate values (not only final plots)
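The checks above can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical DataFrame `df` and a hypothetical cleaning step (dropping rows with missing prices); swap in your own data and transformation.

```python
import pandas as pd

# Hypothetical raw data; replace with your own DataFrame.
df = pd.DataFrame({"price": [10.0, 12.5, None, 9.0], "qty": [1, 2, 3, 4]})

# 1) Run on a small subset first (here the first 100 rows).
subset = df.head(100)

# 2) Record shape and dtypes before the transformation...
print("before:", subset.shape, dict(subset.dtypes))

# Hypothetical cleaning step: drop rows with missing prices.
cleaned = subset.dropna(subset=["price"])

# ...and after, so silent row loss or dtype changes are visible.
print("after:", cleaned.shape, dict(cleaned.dtypes))

# 3) Compare an intermediate value against a simple baseline:
# did the mean move much after cleaning?
print("mean price before/after:", subset["price"].mean(), cleaned["price"].mean())
```

Printing shapes and means between steps is the habit that matters: it turns "the code ran" into "the code did what I think it did."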
2) Data verification¶
confirm units (percent vs fraction; dollars vs thousands)
check missingness patterns
check ranges (impossible values)
confirm time ordering (for forecasts)
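All four data checks fit in a short cell. A sketch with an invented toy dataset; the column names (`date`, `rate`, `sales`) are illustrative only:

```python
import pandas as pd

# Hypothetical dataset with deliberate problems baked in.
df = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-03", "2024-01-01", "2024-01-02"]),
    "rate": [0.05, 0.07, 1.2],        # should be a fraction in [0, 1]
    "sales": [100.0, None, 250.0],
})

# Units: a "rate" above 1 suggests percent/fraction confusion.
unit_suspects = df[df["rate"] > 1]

# Missingness: count missing values per column.
missing = df.isna().sum()

# Ranges: flag impossible values (e.g., negative sales).
impossible = df[df["sales"] < 0]

# Time ordering: forecasting code assumes rows are sorted by date.
in_order = df["date"].is_monotonic_increasing

print(len(unit_suspects), missing["sales"], len(impossible), in_order)
```

On this toy data the checks should catch one suspicious rate, one missing sales value, and out-of-order dates.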
3) Claim verification¶
is the conclusion consistent with the chart/table?
are you accidentally implying causality?
does the result depend on one outlier or one subgroup?
can you reproduce it from scratch?
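The third claim check, whether a result rests on one outlier, can be automated: recompute the statistic after dropping the most extreme point. A sketch on invented data where a single point drives the correlation:

```python
import pandas as pd

# Illustrative data: one extreme point (100, 90) dominates.
df = pd.DataFrame({"x": [1, 2, 3, 4, 100], "y": [2, 1, 3, 2, 90]})

full_corr = df["x"].corr(df["y"])

# Drop the row with the most extreme x value and recompute.
trimmed = df.drop(df["x"].idxmax())
trimmed_corr = trimmed["x"].corr(trimmed["y"])

# If these two numbers differ sharply, the "result" rests on one point.
print(round(full_corr, 3), round(trimmed_corr, 3))
```

Here the full-sample correlation is near 1 while the trimmed one is weak, exactly the situation where a confident conclusion would be misleading.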
Part D — Hallucinations: what they look like in practice¶
Hallucination type 1: Fake citations¶
AI may produce:
plausible author names,
plausible journals,
and plausible titles that do not exist.
Rule: only cite papers you can actually locate and verify.
Hallucination type 2: Wrong-but-plausible code¶
AI code may:
run but use the wrong column,
silently drop missing values,
shuffle time series,
or leak test information into training.
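The last failure mode, leaking test information into training, often looks completely reasonable. A hypothetical sketch: scaling a series with statistics computed on the full dataset (leaky) versus statistics fit on the training split only (safe). Both versions run without error, which is why this bug survives.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=10)

# Leaky version: mean/std computed on ALL data, including test rows.
leaky_scaled = (x - x.mean()) / x.std()

# Safer version: fit the scaling on the training split only,
# then apply those same statistics to the test split.
train, test = x[:8], x[8:]
mu, sigma = train.mean(), train.std()
train_scaled = (train - mu) / sigma
test_scaled = (test - mu) / sigma

# The two scalings of the test rows disagree -- the leak is invisible
# in shapes or error messages, only in the numbers themselves.
print(leaky_scaled[8:], test_scaled)
```

Nothing about the leaky version looks wrong in isolation; you only catch it by knowing what information each step is allowed to see.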
Hallucination type 3: Overconfident interpretation¶
AI often writes overly strong language:
“X causes Y”
“the model proves”
“this confirms”
All of these are red flags when the analysis only supports an association.
Part E — Prompt & Workflow Logs (the course standard)¶
Minimum required fields¶
Task context (what you were trying to do)
Prompt(s)
AI output snippet (short)
What you changed (edits)
Verification steps (sanity checks)
Final decision (what you accepted or rejected)
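If you keep your log in a notebook, the six fields map naturally onto a structured record. A minimal sketch (the field names mirror the list above; the values are invented placeholders):

```python
# One log entry per AI interaction; field names follow the course list.
log_entry = {
    "task_context": "Draft EDA checks for the housing dataset",
    "prompt": "What assumptions do you need to make to analyze this responsibly?",
    "ai_output_snippet": "1. Prices are in dollars, not thousands ...",  # keep short
    "edits": "Renamed columns to match our schema; removed one wrong check",
    "verification": ["ran on first 100 rows", "checked dtypes before/after"],
    "decision": "accepted with edits",
}

prompt_log = [log_entry]  # append a new dict for each interaction
```

A plain markdown table or text file works just as well; what matters is that every accepted AI output has all six fields filled in.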
Part F — A “taste” of agentic workflows (human-in-the-loop)¶
Many people use the term “agentic AI” to mean AI systems that:
plan tasks,
execute steps,
check results,
and iterate toward a goal.
In this course, we use a safe version: Plan → Execute → Check → Iterate, where you execute the steps and approve changes.
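The loop above is ordinary control flow once you name the pieces. A hypothetical sketch where `propose_plan` stands in for an AI call and `execute`/`check` stand in for work you do yourself in Colab:

```python
# Human-in-the-loop sketch: the AI proposes, you execute and check.

def propose_plan():
    # In practice: ask the AI for a step-by-step plan (Pattern 2).
    return ["load data", "check missingness", "fit baseline"]

def execute(step):
    # You run the step in Colab; here it just echoes the step.
    return f"done: {step}"

def check(result):
    # You (or a follow-up prompt) verify the result before moving on.
    return result.startswith("done")

history = []
for step in propose_plan():
    result = execute(step)
    if not check(result):
        break  # stop and revise the plan instead of pushing on
    history.append(result)
```

The key design choice is the `break`: a failed check pauses the loop for a human decision rather than letting the system "iterate" on its own.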
Mini-lab (Google Colab)¶
In-class checkpoints (prompting)¶
Choose one task from your capstone workflow:
EDA checklist for your dataset
cleaning plan
regression specification
clustering plan
forecasting baseline plan
Write a “bad prompt” and a “good prompt” for the same task.
Compare outputs and explain why the “good prompt” is better.
In-class checkpoints (verification)¶
Take one AI-generated code snippet and run it.
Perform at least three of these verification checks:
check shapes/dtypes
check missingness changes
check basic summary stats
compare to a baseline result
Identify at least one failure or risk (even if minor) and fix it.
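The "compare to a baseline" check from the list above catches code that runs but answers the wrong question. A sketch with invented numbers, comparing a hypothetical forecast against a naive repeat-last-value baseline:

```python
history = [98, 99, 100]            # training portion of a toy series
actual = [100, 102, 101, 105]      # held-out test values
forecast = [101, 101, 103, 104]    # hypothetical model output
naive = [history[-1]] * len(actual)  # baseline: repeat last observed value

def mae(pred, truth):
    # Mean absolute error between predictions and actuals.
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(truth)

model_mae = mae(forecast, actual)
baseline_mae = mae(naive, actual)

# If the model cannot beat this baseline, distrust the model (or the code).
print(model_mae, baseline_mae)
```

Here the model beats the baseline (1.25 vs 2.0). When it does not, that is your "failure or risk" to report.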
In-class checkpoints (hallucination hunt)¶
You will be given a short AI-generated paragraph (or code) that contains 2–3 issues.
Find and label issues (fake citation? wrong claim? leakage? wrong column?).
Rewrite the paragraph with correct language and at least one caveat.
In-class checkpoints (agent-like loop)¶
Run one full loop:
AI proposes a plan
you execute in Colab
AI reviews
you implement one improvement
Submission (after class)¶
Colab link (view permission) or PDF export.
Include a prompt/workflow log with your prompts, edits, and verification steps.
AI check (meta)¶
Review questions (quiz / reflection)¶
Name three common AI failure modes in data analysis.
What is one verification check you should always do after cleaning data?
Why is it risky to ask AI for citations?
What is the safe “agent-like loop” used in this course?