Package index • faSTM

Fitting a model

Fit a structural topic model and build the prevalence design.

stm(): Fit a structural topic model (fast Rust backend, stm-compatible object)
s(): Spline term for prevalence formulas
makeDesignMatrix(): Build a (sparse) design matrix for new data (stm-compatible)

Honest effect estimation (method of composition) with weights, cluster-robust standard errors, and random effects, plus marginal effects and effect plots.

estimateEffect(): Estimate covariate effects on topic prevalence (method of composition)
ame(): Average marginal effects from an estimateEffect fit
effect_estimates(): Extract estimateEffect estimates as a tidy data.frame (no plotting)
posterior_theta_samples(): Draw from the per-document topic-proportion posterior
plot(<faSTM_effect>): Plot estimated covariate effects on topic prevalence

Labels, representative documents, FREX, and topic correlations.

label_topics(): Label topics by top words (prob, FREX, lift, score)
sage_labels(): Labels for a content (SAGE) model
find_thoughts(): Representative documents for each topic
find_topic(): Find topics whose top words include given words
topic_terms(): Top terms per topic, with their numeric scores (tidy)
topic_proportions(): Expected topic proportions (the numbers behind the summary plot)
content_topics(): Marginal content words by one content covariate
frex_scores(): FREX scores for every word and topic
topic_correlation(): Topic correlation graph (positive correlations of topic proportions)
topic_corr_graph(): Topic-correlation network as an igraph graph
plot(<faSTM>): Plot a fitted model
plot_topic_network(): Topic correlation network
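
As a rough illustration of what a FREX score measures, here is a minimal base-R sketch that follows the usual definition: the weighted harmonic mean of a word's within-topic frequency rank and exclusivity rank. It does not call faSTM. The helper name frex_sketch, the weight w = 0.5, and the toy matrix are assumptions made for this example, and the package's own frex_scores() may differ in detail (for instance by working on the log scale).

```r
# Toy topic-word matrix: 2 topics (rows) x 4 words (columns); each row sums to 1.
beta <- rbind(
  c(0.70, 0.20, 0.05, 0.05),
  c(0.10, 0.20, 0.60, 0.10)
)
colnames(beta) <- c("tax", "the", "vote", "law")

# Hypothetical helper, not part of faSTM.
frex_sketch <- function(beta, w = 0.5) {
  # Exclusivity: share of each word's probability mass held by each topic.
  excl <- sweep(beta, 2, colSums(beta), "/")
  # Empirical CDF of a quantity within each topic (row), computed from ranks.
  ecdf_rows <- function(m) t(apply(m, 1, function(x) rank(x) / length(x)))
  # Weighted harmonic mean of the exclusivity and frequency ECDF values.
  out <- 1 / (w / ecdf_rows(excl) + (1 - w) / ecdf_rows(beta))
  dimnames(out) <- dimnames(beta)
  out
}

scores <- frex_sketch(beta)
# Highest-FREX word for each topic.
top <- colnames(beta)[apply(scores, 1, which.max)]
```

In this toy example "the" is equally probable in both topics, so its exclusivity is middling and it ranks below "tax" and "vote", which are both frequent and concentrated in one topic.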

Semantic coherence (Mimno / NPMI / C_V), exclusivity, diagnostics.
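
For orientation, the Mimno-style coherence named above can be sketched in base R as the sum, over ordered pairs of a topic's top words, of log((D(v_m, v_l) + 1) / D(v_l)), where D counts the documents containing the word or word pair. The function name mimno_coherence and the toy counts are assumptions for this example; this is not faSTM's implementation.

```r
# Toy document-term matrix: 4 documents (rows) x 3 words (columns), raw counts.
dtm <- rbind(
  c(2, 1, 0),
  c(1, 0, 0),
  c(3, 2, 1),
  c(0, 0, 4)
)
colnames(dtm) <- c("tax", "budget", "vote")

# Hypothetical helper, not part of faSTM.
# top_words must be ordered from most to least probable within the topic.
mimno_coherence <- function(dtm, top_words) {
  # 1 if the word occurs in the document, 0 otherwise.
  present <- (dtm[, top_words, drop = FALSE] > 0) * 1
  # Co-document frequencies D(v_m, v_l); the diagonal holds D(v_l).
  co_doc <- crossprod(present)
  total <- 0
  for (m in seq_along(top_words)[-1]) {
    for (l in seq_len(m - 1)) {
      total <- total + log((co_doc[m, l] + 1) / co_doc[l, l])
    }
  }
  total
}

coh <- mimno_coherence(dtm, c("tax", "budget", "vote"))
```

Values closer to zero indicate that a topic's top words tend to appear in the same documents.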

Held-out evaluation and model selection across K.

search_k(): Search over the number of topics K
select_model(): Fit several models and keep the ones on the quality frontier
select_best(): Pick one model from a select_model run
many_topics(): Select models across a range of K
multi_stm(): Cross-run topic stability
make_heldout(): Create a held-out version of a corpus for document-completion validation
eval_heldout(): Evaluate held-out log-likelihood of a fit on a held-out set
permutation_test(): Permutation test for a binary covariate's effect on topics
topic_lasso(): Predict a document-level outcome from topic proportions (lasso)
plot(<faSTM_searchk>): Plot search_k diagnostics
as.data.frame(<faSTM_searchk>): Convert search_k diagnostics to long form for plotting

Infer topic proportions for new documents.

tidy(<faSTM>): Tidy a faSTM fit (topic-term or document-topic distributions)
tidy(<faSTM_effect>): Tidy an estimateEffect fit (one row per term per topic)
glance(<faSTM>): One-row model summary for a faSTM fit
augment(<faSTM>): Augment: most-likely topic for each document-term token
reexports (tidy, glance, augment): Objects exported from other packages

Read prepared text from quanteda / tidytext and convert corpora.

as_corpus(): Build a faSTM corpus from prepared text
align_corpus(): Align a new corpus to a fitted model's vocabulary
from_tidy(): Build a faSTM corpus from a tidy (long) term-count table
make_dt(): Document-topic proportions as a data frame
read_ldac(), write_ldac(): Read or write a corpus in LDA-C (Blei) sparse format
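
To make the LDA-C layout concrete: each line holds one document as a count of unique terms followed by term:count pairs, with zero-based term indices. The base-R sketch below parses two such lines into stm-style two-row matrices (one-based term index on top, count below). The helper parse_ldac_line is hypothetical and only illustrates the format; it is not how read_ldac() is implemented.

```r
# Two documents in LDA-C format: "<n_unique_terms> <term>:<count> ...".
ldac_lines <- c("3 0:2 4:1 7:3", "2 1:1 4:5")

# Hypothetical helper, not part of faSTM.
parse_ldac_line <- function(line) {
  fields <- strsplit(trimws(line), "[[:space:]]+")[[1]]
  n_terms <- as.integer(fields[1])
  # Split each "term:count" pair into a two-column character matrix.
  pairs <- do.call(rbind, strsplit(fields[-1], ":", fixed = TRUE))
  # The leading number must match the number of pairs on the line.
  stopifnot(nrow(pairs) == n_terms)
  # Shift term indices by one because R indexes from 1.
  rbind(
    term  = as.integer(pairs[, 1]) + 1L,
    count = as.integer(pairs[, 2])
  )
}

docs <- lapply(ldac_lines, parse_ldac_line)
```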

Aliases that keep stm-style call sites working unmodified.

alignCorpus(): Align a new corpus to a reference vocabulary (stm-compatible)
asSTMCorpus(): Coerce inputs into an stm-style corpus (stm-compatible)
convertCorpus(): Convert documents/vocab between corpus formats (stm-compatible)
fitNewDocuments(): Infer topics for new documents (stm-compatible signature)
checkBeta(): Flag words that load almost entirely on one topic