What input data does this skill require?

It requires a count matrix of non-negative integers, with samples as rows and genes as columns, and a metadata DataFrame indexed by the same samples that records the experimental factors and conditions.
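These input requirements can be checked before any model is fitted. The following is a minimal sketch using pandas only; the function name `validate_inputs` and the toy sample and gene names are hypothetical and are not part of the skill or of PyDESeq2.

```python
import pandas as pd


def validate_inputs(counts: pd.DataFrame, metadata: pd.DataFrame) -> None:
    """Raise ValueError if the inputs break the stated requirements."""
    # Every gene column must hold integers (raw counts, not normalized values).
    if not all(pd.api.types.is_integer_dtype(dtype) for dtype in counts.dtypes):
        raise ValueError("counts must contain integers only")
    # Raw read counts cannot be negative.
    if (counts < 0).any().any():
        raise ValueError("counts must be non-negative")
    # Rows of counts are samples; metadata must describe the same samples in the same order.
    if not counts.index.equals(metadata.index):
        raise ValueError("counts and metadata must share the same sample index")


# Toy data (hypothetical): 4 samples as rows, 3 genes as columns.
counts = pd.DataFrame(
    {"geneA": [10, 0, 25, 7], "geneB": [3, 8, 0, 12], "geneC": [100, 95, 210, 180]},
    index=["s1", "s2", "s3", "s4"],
)
metadata = pd.DataFrame(
    {"condition": ["control", "control", "treated", "treated"]},
    index=["s1", "s2", "s3", "s4"],
)

validate_inputs(counts, metadata)  # valid inputs pass without raising
```

Failing early on fractional or negative values helps catch a common mistake, namely passing normalized expression values where raw counts are expected.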

How does this handle multiple testing correction?

PyDESeq2 automatically performs Benjamini-Hochberg (FDR) correction on raw p-values, providing an adjusted p-value (padj) column in the results.
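To make the adjustment concrete, here is a minimal pure-Python sketch of the Benjamini-Hochberg procedure. It illustrates the general method only and is not PyDESeq2's internal code; the function name `benjamini_hochberg` and the example p-values are hypothetical, and the sketch leaves out any gene filtering that a real pipeline may apply before adjustment.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values never decrease with rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.20]
padj = benjamini_hochberg(raw)
print(padj)
```

Each raw p-value is scaled by m / rank, and the running minimum ensures that a gene with a smaller raw p-value never receives a larger adjusted value than a gene ranked below it.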

Does it support log-fold change (LFC) shrinkage?

Yes, the skill supports optional apeGLM shrinkage, which is highly recommended for visualization and ranking genes by their effect size.

Can I use this skill to control for batch effects?

Yes, you can specify multi-factor designs in the design formula (e.g., '~batch + condition') to account for technical variation or other covariates.
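As an illustration of what a formula such as '~batch + condition' encodes, the sketch below builds the corresponding treatment-coded design matrix by hand with pandas. It does not use PyDESeq2's own formula handling, and the sample names and factor levels are hypothetical.

```python
import pandas as pd

# Hypothetical metadata: two batches, each containing one control and one treated sample.
metadata = pd.DataFrame(
    {
        "batch": ["b1", "b1", "b2", "b2"],
        "condition": ["control", "treated", "control", "treated"],
    },
    index=["s1", "s2", "s3", "s4"],
)

# Treatment coding: the first level of each factor is dropped and acts as the reference.
design = pd.get_dummies(metadata[["batch", "condition"]], drop_first=True).astype(int)
# An intercept column models the reference group (batch b1, condition control).
design.insert(0, "Intercept", 1)

print(design)
```

Because batch has its own column, the condition coefficient is estimated from differences between treated and control samples after the batch offset is accounted for.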

Is this skill compatible with AnnData objects?

Yes, the skill includes patterns for extracting count matrices and metadata from AnnData objects, making it compatible with Scanpy-based workflows.

PyDESeq2 Bioinformatics Analysis

Author: Zehong-Wang

Data Science and Machine Learning

Performs differential gene expression analysis on bulk RNA-seq data using the DESeq2 framework within Python.

This skill integrates the PyDESeq2 library into Claude to streamline the identification of differentially expressed genes from transcriptomic count matrices. It facilitates the entire bioinformatics workflow, including data normalization, dispersion estimation, statistical testing with Wald tests, and FDR correction. Ideal for researchers transitioning from R to Python or those building integrated genomic pipelines, the skill supports complex experimental designs, batch effect correction, and high-quality visualizations like volcano and MA plots. By providing standardized implementation patterns, it ensures robust and reproducible statistical analysis of genomic data.

Key Features

01 Statistical testing via Wald tests with Benjamini-Hochberg FDR correction

02 Built-in visualization support for publication-ready volcano and MA plots

03 Automated differential expression workflows for bulk RNA-seq counts

04 0 GitHub stars

05 Support for complex multi-factor experimental designs and batch effect correction

06 Optional apeGLM shrinkage for improved log-fold change estimation and ranking

Use Cases

01 Converting R-based DESeq2 workflows into Python-integrated data science pipelines

02 Identifying differentially expressed genes between treated and control groups

03 Adjusting for technical batch effects and covariates in large-scale transcriptomic studies