About
This skill provides a statistical 'first look' at SAE (sparse autoencoder) features by calculating activation statistics, PageRank-weighted token influences, and ability family breakdowns. Designed for mechanistic interpretability workflows, it helps researchers distinguish a feature's true core concept from 'flanderized' super-stimuli by analyzing activation regions separately, such as the Floor, Core, and High zones. Use it as the starting point for investigating how models represent domain-specific data, such as Splatoon NLP data: it supplies the statistical foundation needed before moving on to deeper causal experiments.
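To make the zone idea concrete, here is a minimal sketch of how a feature's activations could be bucketed into Floor, Core, and High regions and summarized per region. It is an illustration only: the function names (`zone_of`, `summarize_zones`) and the cut-offs (25% and 75% of the feature's maximum activation) are hypothetical assumptions, not the skill's actual API or thresholds.

```python
from statistics import mean

# Hypothetical zone boundaries, as fractions of the feature's max activation.
# These cut-offs are illustrative assumptions, not values defined by the skill.
FLOOR_MAX = 0.25
CORE_MAX = 0.75


def zone_of(activation: float, max_activation: float) -> str:
    """Assign one activation to the Floor, Core, or High zone."""
    ratio = activation / max_activation
    if ratio < FLOOR_MAX:
        return "floor"
    if ratio < CORE_MAX:
        return "core"
    return "high"


def summarize_zones(activations: list[float]) -> dict[str, dict[str, float]]:
    """Return count and mean activation per zone, ignoring zero activations."""
    active = [a for a in activations if a > 0]
    if not active:
        return {}
    peak = max(active)
    buckets: dict[str, list[float]] = {"floor": [], "core": [], "high": []}
    for a in active:
        buckets[zone_of(a, peak)].append(a)
    return {
        name: {"count": len(vals), "mean": mean(vals) if vals else 0.0}
        for name, vals in buckets.items()
    }


if __name__ == "__main__":
    example = [0.0, 0.1, 0.2, 0.5, 0.6, 0.9, 1.0]
    for name, stats in summarize_zones(example).items():
        print(name, stats)
```

Comparing the examples that land in the Core bucket with those in the High bucket is one way to check whether the strongest activations still reflect the feature's main concept or have drifted toward exaggerated super-stimuli.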