Lesson 11 — Scientific Research with AI in Social Sciences
(literature mapping, evidence synthesis, qualitative coding, and Capstone Proposal v1)
Why this matters (motivation)¶
AI tools can speed up:
finding papers,
organizing themes,
drafting summaries,
and coding text.
But social-science research is not “speed-writing.” Quality comes from:
careful question design,
transparent evidence handling,
and critical interpretation.
This week also connects to what you have learned so far: you now have multiple tool families (EDA/visualization, probability/simulation, regression, clustering/PCA, trees, text signals, and forecasting). The research challenge is choosing the right tool for the question—and justifying that choice with evidence and transparency.
A research workflow (the course version)¶
Today we focus on Steps 1–3 of that workflow (research question, literature mapping, and evidence synthesis; Parts A–C below) and practice a small piece of qualitative coding (Part D).
Part A — Research questions: from topic to question¶
Topic vs question¶
Topic: “Remote work”
Research question: “How is remote work associated with self-reported productivity and job satisfaction across industries?”
A good question has:
a clear population/unit (who? what?)
a clear outcome (what is measured?)
a feasible strategy (what data or evidence?)
A quick “question tightening” template¶
Population / setting: (e.g., Thai firms, ASEAN consumers, university students)
Outcome: (e.g., sales growth, job satisfaction, adoption probability, inflation)
Mechanism or channel (optional): (e.g., flexibility, monitoring, cost savings)
Comparison: (e.g., before/after, high/low exposure, group comparisons)
Feasibility: data available? time constraints? permissions?
Part B — Literature mapping with AI (transparent, not magical)¶
What literature mapping is (and is not)¶
It is not: “ask AI for 10 papers and copy the summary.”
It is: a structured method to discover themes, debates, and gaps.
Recommended tools (choose your level)¶
Semantic Scholar (search + metadata; API optional; a search sketch follows below)
Connected Papers (visual map; optional)
Zotero (organize citations; optional)
AI usage: drafting search strings, summarizing abstracts, and proposing a theme taxonomy are all appropriate, but you must verify every paper and claim against the original source.
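If you take the API route, the sketch below pulls candidate papers (title, year, venue, abstract) from Semantic Scholar's Graph API. The endpoint, parameter names, and example query reflect the public documentation and should be treated as assumptions: check the current docs and respect rate limits.

```python
import requests

# Minimal sketch: search the Semantic Scholar Graph API for candidate papers.
# Endpoint and field names assume the current public API; verify against the docs.
SEARCH_URL = "https://api.semanticscholar.org/graph/v1/paper/search"

def search_papers(query, limit=15):
    """Return a list of dicts with title, year, venue, and abstract."""
    params = {"query": query, "limit": limit, "fields": "title,year,venue,abstract"}
    resp = requests.get(SEARCH_URL, params=params, timeout=30)
    resp.raise_for_status()
    return resp.json().get("data", [])

# Example query (an assumption, matching the remote-work question above).
for paper in search_papers("remote work job satisfaction productivity")[:5]:
    print(paper.get("year"), "|", paper.get("title"))
```

Whatever tool you use, record the query, the date, and the number of hits so your search is reproducible.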
Part C — Evidence synthesis: writing “what we know”¶
A good synthesis does more than list papers. It organizes evidence by:
methods (survey, experiment, panel regression, text analysis, forecasting),
context (country, industry),
findings (consistent vs mixed),
limitations (data, measurement, identification),
and open questions.
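One lightweight way to make this organization explicit is a synthesis matrix: one row per paper, one column per dimension. A minimal pandas sketch follows; the rows are placeholders, not real studies, so replace them with your screened papers.

```python
import pandas as pd

# Placeholder synthesis matrix: one row per paper, one column per dimension.
# The entries below are illustrative placeholders, not real studies.
evidence = pd.DataFrame([
    {"paper": "Paper A (placeholder)", "method": "survey",
     "context": "Thai service firms", "finding": "positive association",
     "limitation": "self-reported outcome"},
    {"paper": "Paper B (placeholder)", "method": "panel regression",
     "context": "ASEAN manufacturing", "finding": "mixed",
     "limitation": "short panel, selection concerns"},
])

# Grouping by method makes "consistent vs mixed" findings easy to see.
print(evidence.groupby("method")["finding"].apply(list))
```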
Part D — Qualitative coding with AI (and why we compare to manual coding)¶
Why qualitative coding matters¶
Interviews, open-ended survey responses, policy documents, and news text are common evidence sources.
Coding turns unstructured text into:
themes,
categories,
and inputs for structured analysis.
AI-assisted coding: promise and risk¶
Promise:
speed,
consistency,
quick theme suggestions.
Risks:
hallucinated themes,
overconfident labels,
bias/stereotypes,
missing nuance (sarcasm, context).
A simple coding exercise (what we do today)¶
We code a small set of quotes using:
manual coding (small groups)
AI-assisted coding (structured prompt; a sketch follows this list)
Then we compare:
agreements/disagreements,
where AI overgeneralized,
where humans disagreed (coding ambiguity).
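A minimal sketch of what "structured prompt" means here: build the prompt from your codebook so the model may only use your codes and must flag ambiguous cases. The codebook entries and quotes below are hypothetical placeholders; in class, use your group's codebook and the provided quotes.

```python
# Hypothetical codebook and quotes (placeholders for the in-class materials).
codebook = {
    "FLEX": "Mentions schedule or location flexibility",
    "ISOL": "Mentions loneliness or reduced social contact",
    "PROD": "Mentions productivity, focus, or output",
    "MGMT": "Mentions monitoring, trust, or management practices",
}

quotes = [
    "Working from home lets me pick up my kids, but I miss the office chatter.",
    "My manager checks in constantly, which makes me feel less trusted.",
]

def build_coding_prompt(codebook, quotes):
    """Assemble a coding prompt that restricts the model to the given codes
    and asks it to flag quotes it cannot code confidently."""
    lines = [
        "Apply ONLY the codes below to each quote.",
        "If no code fits, or the quote is ambiguous, label it AMBIGUOUS and explain why.",
        "",
        "Codebook:",
    ]
    lines += [f"- {code}: {definition}" for code, definition in codebook.items()]
    lines += ["", "Quotes:"]
    lines += [f"{i + 1}. {q}" for i, q in enumerate(quotes)]
    return "\n".join(lines)

print(build_coding_prompt(codebook, quotes))
```

Paste the assembled prompt into whichever chat model your group uses, and save both the prompt and the output in your workflow log.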
Mini-lab (Google Colab)¶
In-class checkpoints (Literature mapping)¶
Choose a research question (provided list or your own).
Create a search query list (5–10 keywords/phrases).
Collect 10–15 candidate papers:
title + year + venue/source
abstract (if available)
Screen down to 6–8 “most relevant” papers using explicit criteria (a minimal sketch follows this list).
Group the final set into 2–4 themes and write a 150–200 word map.
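For the screening step, "explicit criteria" means rules you could write down and someone else could re-apply. A minimal sketch with placeholder rows, assuming two criteria (publication year and an outcome keyword in the abstract):

```python
import pandas as pd

# Placeholder candidate list; in class, use the papers you actually collected.
candidates = pd.DataFrame([
    {"title": "Placeholder study 1", "year": 2021,
     "abstract": "Remote work and job satisfaction in service firms."},
    {"title": "Placeholder study 2", "year": 2012,
     "abstract": "A history of telecommuting policy."},
])

# Explicit, re-applicable criteria: recent enough AND mentions the outcome.
keep = (candidates["year"] >= 2018) & candidates["abstract"].str.contains(
    "satisfaction|productivity", case=False
)

screened = candidates[keep]
print(screened[["title", "year"]])
```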
In-class checkpoints (Qualitative coding)¶
Use a small set of interview/open-ended responses (provided).
Define a codebook with 4–6 codes (short definitions).
Code the same set manually (group work).
Run AI-assisted coding using your codebook and compare results (an agreement sketch follows this list).
Write 5–7 lines:
where AI matched humans,
where it differed,
what you would do next to improve reliability.
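One concrete way to compare the two codings is an agreement score on the same quotes. A minimal sketch, assuming one code per quote (the labels are placeholders; use your group's actual codes):

```python
from sklearn.metrics import cohen_kappa_score

# Placeholder labels: manual vs AI-assisted codes for the same six quotes.
manual_codes = ["FLEX", "ISOL", "PROD", "MGMT", "FLEX", "PROD"]
ai_codes     = ["FLEX", "FLEX", "PROD", "MGMT", "FLEX", "ISOL"]

# Simple percent agreement.
agreement = sum(m == a for m, a in zip(manual_codes, ai_codes)) / len(manual_codes)
print(f"Percent agreement: {agreement:.2f}")

# Cohen's kappa corrects for agreement expected by chance.
print(f"Cohen's kappa: {cohen_kappa_score(manual_codes, ai_codes):.2f}")

# Disagreements are the cases to discuss in your 5-7 reflection lines.
for i, (m, a) in enumerate(zip(manual_codes, ai_codes)):
    if m != a:
        print(f"Quote {i + 1}: manual={m}, AI={a}")
```

If your group assigns more than one code per quote, compare code by code rather than quote by quote before computing agreement.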
Capstone Proposal v1 (due after class)¶
Submission format: PDF or Markdown export via LMS. Include a short prompt/workflow log if AI tools were used.
AI check (responsible use for research)¶
Good prompt examples
“Propose search terms for studies on remote work and job satisfaction across industries.”
“Given these abstracts, group them into themes and explain your grouping logic.”
“Here is my codebook. Apply these codes to the following quotes and flag ambiguous cases.”
Bad prompt examples
“Write my literature review with citations” (risk of fake citations and shallow synthesis)
“Decide the themes and final conclusions for me” (outsourcing research judgment)
Review questions (quiz / reflection)¶
What makes a research question “answerable” rather than just a broad topic?
What are two risks of AI-assisted literature summaries?
Why do we compare AI coding to manual coding?
What is one transparency practice you will use in your capstone?