Guides the specification, validation, and reporting of Structural Topic Models (STM) for survey and experimental text data.
This skill provides a methodological framework for topic modeling that follows social science standards. It helps users choose among Structural Topic Models (STM), LDA, and BERTopic, and gives technical guidance on preprocessing text, selecting the number of topics (K) with diagnostic metrics, and estimating the effects of metadata covariates on topic prevalence. Whether you are analyzing open-ended survey responses or experimental corpora, the skill is designed to keep your analysis reproducible, validated across treatment groups, and reported according to academic best practices.
Key Features
01 15 GitHub stars
02 Methodological guidance on text preprocessing and frequency thresholds
03 Standardized reporting templates for DA-RT compliant research
04 Validation techniques including permutation tests and FREX word analysis
05 Multi-metric diagnostic evaluation for selecting topic counts (K)
06 Structural Topic Model (STM) specification with metadata covariates
Use Cases
01 Analyzing open-ended survey responses with respondent demographic metadata
02 Developing reproducible text analysis pipelines for academic publications
03 Estimating the effect of experimental treatments on text discussion topics
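
As an illustration of the permutation-test validation listed under Key Features, here is a minimal Python sketch, not the skill's own implementation. It assumes you already have one estimated topic proportion per document (for example, exported from a fitted STM) and a treatment indicator per document. The function name `permutation_test` and the sample values are hypothetical and exist only for this example.

```python
import random
from statistics import mean


def permutation_test(proportions, treated, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in mean topic proportion.

    proportions: one topic proportion per document (assumed precomputed).
    treated: one boolean per document, True if the document is in the treatment group.
    Returns (observed difference in means, permutation p-value).
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible

    def mean_difference(labels):
        treat = [p for p, g in zip(proportions, labels) if g]
        control = [p for p, g in zip(proportions, labels) if not g]
        return mean(treat) - mean(control)

    observed = mean_difference(treated)
    labels = list(treated)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(labels)  # reassign treatment labels at random, keeping group sizes
        if abs(mean_difference(labels)) >= abs(observed) - 1e-12:
            extreme += 1
    # Add-one correction keeps the p-value strictly above zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical data: proportion of one topic in 12 survey responses.
proportions = [0.62, 0.55, 0.70, 0.58, 0.66, 0.61,
               0.20, 0.25, 0.18, 0.30, 0.22, 0.27]
treated = [True] * 6 + [False] * 6

observed, p_value = permutation_test(proportions, treated)
print(f"difference in means: {observed:.3f}, p-value: {p_value:.4f}")
```

A small p-value suggests that the difference in topic prevalence between groups would be unlikely under random assignment of labels. This sketch treats the estimated proportions as fixed and does not re-estimate the topic model for each permutation, so it is a simplification of what a full STM permutation procedure would do.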