Lesson 13 — Ethics, Fairness & Responsible AI
(privacy, bias, explainability, and “should we deploy?” thinking)
Why this matters (motivation)
Analytics and AI are not only technical tools—they influence people. Even “simple” models can affect:
credit access,
hiring and promotion,
pricing and eligibility,
education and opportunities.
Responsible AI means:
using data legally and ethically,
protecting privacy,
avoiding unfair harms,
and communicating limitations honestly.
Part A — What “responsible AI” means in this course
Two useful frameworks (light use)
OECD AI Principles (high-level principles)
NIST AI Risk Management Framework (risk-based approach)
We use these as checklists, not as heavy policy documents.
Part B — Privacy and data governance (practical)
What counts as sensitive in typical student projects?
Examples:
personal identifiers (names, phone numbers, precise addresses)
health information
detailed location traces
any data that can re-identify individuals
high-risk attributes tied to discrimination concerns
In this course:
prefer public datasets (e.g., OWID) or anonymized teaching datasets
if using survey/interview text, remove personal identifiers
Privacy basics to communicate in a report
where the data came from
what personal data is included (if any)
how it was anonymized/aggregated
who has access
how long it is stored
Part C — Fairness and bias: where problems enter
Bias can enter through multiple channels:
1) Target label problem
What is the label measuring?
“loan default” may reflect economic shocks and unequal opportunity, not only “responsibility”
“employee performance” can be influenced by manager bias
2) Feature and proxy problem
Some features act as proxies for sensitive attributes:
zip code, language, school attended, device type, etc.
3) Sampling problem
Your dataset may not represent the population:
missing rural groups
missing small firms
platform users only
4) Measurement problem
Errors may differ across groups:
underreporting in some communities
inconsistent definitions across countries
5) Objective problem
Optimizing overall accuracy can produce unequal harm. Example: a model may reduce errors for majority groups while increasing errors for minority groups.
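The point is easy to see with made-up numbers. In this toy sketch (all counts invented for illustration), a model that is right on most majority-group cases but wrong on half of a small minority group still posts a high overall accuracy:

```python
# Toy illustration (made-up data): overall accuracy can hide unequal error rates.
# Group A has 90 cases, group B has 10; the model is right on 86 of A but only 5 of B.
correct = {"A": 86, "B": 5}
total = {"A": 90, "B": 10}

overall_accuracy = sum(correct.values()) / sum(total.values())
per_group = {g: correct[g] / total[g] for g in total}

print(f"Overall accuracy: {overall_accuracy:.2f}")   # 0.91
for g, acc in per_group.items():
    print(f"Group {g} accuracy: {acc:.2f}")          # A: 0.96, B: 0.50
```

An aggregate score of 0.91 looks reassuring, yet group B experiences a coin-flip model.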
Part D — Minimal fairness checks (what we can do at this level)
We keep fairness checks simple and transparent.
Step 1: Choose a grouping variable (if available and appropriate)
Examples:
region
age bracket
income bracket
firm size category
(avoid sensitive attributes unless clearly justified and ethically handled)
Step 2: Compare performance by group
For classification (e.g., churn yes/no), compare:
accuracy by group
false positive rate by group
false negative rate by group
For continuous outcomes (regression/forecasting), compare:
MAE or RMSE by group
systematic under/over-prediction by group
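These comparisons can be sketched with pandas. The column names (`group`, `y_true`, `y_pred`) and the tiny data frame below are placeholders for your own data:

```python
import pandas as pd

# Minimal sketch: per-group accuracy, false positive rate, false negative rate.
# Replace this toy frame with your own predictions and grouping variable.
df = pd.DataFrame({
    "group":  ["A", "A", "A", "A", "B", "B", "B", "B"],
    "y_true": [1, 0, 1, 0, 1, 0, 1, 0],
    "y_pred": [1, 0, 0, 0, 1, 1, 0, 1],
})

def group_metrics(g: pd.DataFrame) -> pd.Series:
    tp = ((g.y_true == 1) & (g.y_pred == 1)).sum()
    tn = ((g.y_true == 0) & (g.y_pred == 0)).sum()
    fp = ((g.y_true == 0) & (g.y_pred == 1)).sum()
    fn = ((g.y_true == 1) & (g.y_pred == 0)).sum()
    return pd.Series({
        "accuracy": (tp + tn) / len(g),
        "fpr": fp / (fp + tn) if (fp + tn) else float("nan"),
        "fnr": fn / (fn + tp) if (fn + tp) else float("nan"),
    })

print(df.groupby("group")[["y_true", "y_pred"]].apply(group_metrics))

# For regression/forecasting, the same groupby pattern works with MAE:
# df.assign(abs_err=(df.y_true - df.y_pred).abs()).groupby("group")["abs_err"].mean()
```

Reporting the table itself (rather than a single aggregate score) is the whole point of the check.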
Step 3: Interpret carefully
A difference does not automatically prove discrimination. But it is a red flag that requires explanation and monitoring.
Part E — Explainability: what it is and what it isn’t
Many real stakeholders ask: “Why did the model make this prediction?”
Global vs local explainability (simple distinction)
Global: what features matter overall? (feature importance)
Local: why for this particular case? (SHAP/LIME style explanations)
Practical role in your capstone
You may use explainability to:
identify surprising predictors,
detect potential proxy features,
explain predictions in a report.
But always include a caveat:
“This explains the model, not the true causal mechanism.”
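As one possible demo of global explainability, the sketch below uses scikit-learn's `permutation_importance` on synthetic data. The feature names are invented purely for illustration (a real run would use your capstone features), and proxy-sounding names like `zip_code_enc` are there only to show how such a feature would surface in the ranking:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic data; feature names are hypothetical labels for illustration.
X, y = make_classification(n_samples=500, n_features=4, n_informative=2,
                           random_state=0)
feature_names = ["income", "tenure", "zip_code_enc", "device_type_enc"]

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Permutation importance: shuffle each feature on the test set and
# measure how much the model's score drops.
result = permutation_importance(model, X_test, y_test, n_repeats=10,
                                random_state=0)
for name, imp in sorted(zip(feature_names, result.importances_mean),
                        key=lambda t: -t[1]):
    print(f"{name:16s} {imp:+.3f}")
```

A large importance for a proxy-like feature (e.g. an encoded zip code) is exactly the kind of red flag worth investigating; and, as the caveat above says, the ranking describes the model's behavior, not a causal mechanism.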
Part F — “Should we deploy?” (a decision rubric)
Even if you are not actually deploying systems, this is a useful professional habit.
Mini case discussion (in class)
Choose one case:
Credit scoring
Hiring screening
Pricing / eligibility
Student risk prediction (education analytics)
Platform moderation or fraud detection
For the chosen case:
identify risks in data, labels, features
propose minimal fairness checks
propose transparency statements
Mini-lab (Google Colab)
In-class checkpoints (privacy & documentation)
Write a short “Data Governance Note” for your capstone dataset:
source, access date, unit of observation
what personal data exists (if any)
storage/access plan
limitations
In-class checkpoints (fairness checks)
Choose a grouping variable (if available).
Compute and compare at least one metric by group:
classification: accuracy + FN rate (or FP rate)
forecasting/regression: MAE by group or by time period
Write 5–7 lines interpreting results and stating one risk.
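For the forecasting variant of this checkpoint, a minimal MAE-by-period check might look like this (the data and column names are invented placeholders):

```python
import pandas as pd

# Hypothetical forecast-error check: MAE by time period.
df = pd.DataFrame({
    "period":    ["2023", "2023", "2024", "2024"],
    "actual":    [100.0, 120.0, 110.0, 130.0],
    "predicted": [ 90.0, 125.0, 140.0, 100.0],
})

mae_by_period = (df.assign(abs_err=(df.actual - df.predicted).abs())
                   .groupby("period")["abs_err"].mean())
print(mae_by_period)   # 2023: 7.5, 2024: 30.0
```

A jump like the one from 2023 to 2024 here is the kind of result your 5–7 interpretation lines should flag and try to explain.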
In-class checkpoints (explainability — concept + demo)
Use a simple explainability tool (or a simplified feature importance):
identify top features driving predictions
identify at least one proxy risk or surprising feature, and describe what you would investigate next
In-class checkpoints (capstone checklist)
Draft your Responsible AI Checklist for the capstone (see below).
Submission (after class)
Colab link (view permission) or PDF export.
Include your Responsible AI Checklist and a short ethics note.
Capstone Responsible AI Checklist (required)
AI check (meta)
Review questions (quiz / reflection)
Give two ways bias can enter a model pipeline.
Why can an accurate model still be harmful?
What is the difference between explaining the model and explaining the real world?
Name one fairness check you can do at a basic level in this course.