3. Validate the topics¶
This is the step that separates a publishable analysis from a fishing expedition. You must show, with evidence, that your topics are (a) coherent to humans, (b) reproducible, and (c) interpretable as substantive themes.
Human validation: intrusion tests¶
The field-standard test (Chang et al. 2009, Reading Tea Leaves) asks whether a human can tell a topic apart from noise. topica builds both variants for you, with an answer key.
Word intrusion: each topic's top words plus one intruder from another topic.
import topica
tests = topica.word_intrusion(model, n_words=5, seed=0)
for t in tests:
print(f"Topic {t['topic']}: {t['words']}")
# answer key for coding:
# intruder '{t['intruder']}' at position {t['intruder_index']}
Show the shuffled words to coders (hide the key); a topic where coders reliably
spot the intruder is coherent.
Document intrusion: a topic's representative documents plus one where the topic is nearly absent. This tests whether the topic captures real document similarity.
Report intrusion accuracy (and, ideally, inter-coder agreement) in your supplementary materials.
Computational validation: coherence and exclusivity¶
Per-topic metrics complement the human tests. Report both together: a good topic is coherent and exclusive.
coherence = model.coherence(10) # UMass, per topic
cv = topica.coherence(model, texts, coherence_type="c_v", topn=10)
exclusivity = topica.exclusivity(model, 10) # per topic
frontier = topica.quality_frontier(model, n=10) # tidy: coherence, exclusivity, prevalence
c_v correlates best with human judgement; UMass is a fast intrinsic check;
NPMI is the normalized middle ground. The coherence×exclusivity scatter is the
canonical STM quality plot: weak topics sit in the lower-left.
Stability: answer the "fishing expedition" critique head-on¶
A topic that dissolves when you perturb the corpus is not a finding. Refit on bootstrap resamples and report which topics are robust:
boot = topica.bootstrap_stability(docs, k=20, n_boot=50, iterations=800)
for t, s in zip(boot["topic"], boot["stability"]):
print(f"Topic {t}: stability {s:.2f}") # mean top-word Jaccard across resamples
print("overall:", boot["mean"])
Also check that the same topics emerge across random seeds, aligning topics between fits and scoring their overlap:
a = topica.LDA(num_topics=20, seed=1); a.fit(docs, iterations=800)
b = topica.LDA(num_topics=20, seed=2); b.fit(docs, iterations=800)
pairs = topica.align_topics(a, b) # one-to-one matching (Hungarian)
print("stability across seeds:", topica.topic_stability([a, b], topn=10))
Flag fragile topics explicitly. A paper that says "topics 4 and 11 were unstable across resamples and we interpret them cautiously" is more credible, not less.
Interpretation: close reading, not just top words¶
Distant reading (top words) and close reading (the actual documents) are both required. Pull a topic's most representative documents and read them, with the topic's words highlighted in your notebook:
labels = topica.label_topics(model.topic_word, model.vocabulary, n=8) # prob & FREX
html = topica.find_thoughts_html(model, texts, n_docs=3, n_words=8) # highlighted quotes
Use FREX (frequent and exclusive) words, not just the most probable ones, when labeling. They are far more evocative of what makes a topic distinctive.
What topics are NOT
Topics are statistical patterns of word co-occurrence. They are not necessarily coherent concepts, and not necessarily what a document is "about." Write "this document has high probability for Topic 3" and "words associated with Topic 3 suggest…" Never write "the model discovered that…" or "this document is about Topic 3."
Common problems and what to do¶
| Symptom | Likely cause | Fix |
|---|---|---|
| Junk topics (stopwords, numbers) | Under-pruned vocabulary | Add custom stopwords; raise min_doc_freq; rm_top |
| Duplicate topics | K too high, or duplicate documents |
Lower K; dedupe; check align_topics / topic_correlation |
| Uninterpretable topics | Not every topic is meaningful | Document as "Mixed/Other"; consider lower K |
| One dominant topic | Corpus-wide vocabulary | May be legitimate; or treat as a background topic and rm_top |
→ Next: Measure effects properly.