Infers gene regulatory networks from transcriptomics data using scalable algorithms like GRNBoost2 and GENIE3.
Arboreto is a Python library for reconstructing gene regulatory networks (GRNs) from large gene expression datasets, such as bulk or single-cell RNA-seq. It provides parallelized implementations of GRNBoost2 (gradient boosting) and GENIE3 (random forest), which score candidate regulatory links between transcription factors and target genes in expression data collected across diverse biological conditions. This skill is aimed at bioinformaticians who need to scale regulatory network inference from a local machine to a multi-node Dask cluster while keeping the computation efficient.
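A minimal sketch of what a local GRNBoost2 run might look like. The helper name `infer_grn`, the file arguments, and the default seed are illustrative assumptions. The input layout (rows are cells or samples, columns are genes, tab-separated) follows the convention in Arboreto's own examples, so check it against your data.

```python
def infer_grn(expression_tsv, tf_list_txt=None, seed=777):
    """Run GRNBoost2 on a tab-separated expression matrix and return the edge table.

    Assumed layout: rows are observations (cells or samples), columns are genes.
    The function name and arguments are hypothetical, chosen for this sketch.
    """
    # Imported lazily so this module can be loaded without arboreto installed.
    import pandas as pd
    from arboreto.algo import grnboost2
    from arboreto.utils import load_tf_names

    expression = pd.read_csv(expression_tsv, sep="\t")

    # Without a TF list, every gene is treated as a candidate regulator.
    tf_names = load_tf_names(tf_list_txt) if tf_list_txt else "all"

    # The result is a DataFrame with the columns TF, target, and importance.
    network = grnboost2(expression_data=expression, tf_names=tf_names, seed=seed)
    return network.sort_values("importance", ascending=False)
```

Swapping `grnboost2` for `genie3` (also in `arboreto.algo`) should give the random-forest variant with the same call shape. Because Arboreto starts Dask workers under the hood, calling the helper from inside an `if __name__ == "__main__":` guard is the safer pattern when running it as a script.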
Key Features
01 Scalable GRN inference using GRNBoost2 and GENIE3 algorithms
02 Seamless integration with the pySCENIC pipeline for single-cell analysis
03 Identification of transcription factor (TF) to target gene relationships
04 Support for both bulk and single-cell RNA-seq expression matrices
05 Distributed computing support via Dask for large-scale datasets
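The Dask support listed above could be used roughly as follows. This is a sketch under assumptions: the helper name `infer_grn_on_cluster` and the `scheduler_address` argument are hypothetical, and the address of a real scheduler depends on your cluster setup. The `client_or_address` parameter is the one Arboreto's inference functions accept for an existing Dask client.

```python
def infer_grn_on_cluster(expression, tf_names="all", scheduler_address=None):
    """Run GRNBoost2 on a Dask cluster and return the edge table.

    expression: pandas DataFrame (rows are observations, columns are genes).
    scheduler_address: e.g. "tcp://host:8786" (hypothetical); if None,
    a throwaway LocalCluster is started on this machine instead.
    """
    # Imported lazily so this module can be loaded without these packages.
    from distributed import Client, LocalCluster
    from arboreto.algo import grnboost2

    if scheduler_address is None:
        cluster = LocalCluster()
        client = Client(cluster)
    else:
        cluster = None
        client = Client(scheduler_address)

    try:
        return grnboost2(
            expression_data=expression,
            tf_names=tf_names,
            client_or_address=client,
        )
    finally:
        # Release the workers whether or not inference succeeded.
        client.close()
        if cluster is not None:
            cluster.close()
```

Keeping the client setup outside the inference call means the same function can be pointed at a laptop or a multi-node scheduler by changing one argument.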
Use Cases
01 Analyzing gene expression patterns to find transcription factor targets in bulk RNA-seq
02 Performing comparative regulatory network analysis across multiple experimental conditions
03 Identifying cell-type-specific regulatory interactions in single-cell RNA-seq data
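For the comparative use case, one simple heuristic is to compare the top-ranked edges of two inferred networks. The sketch below is not part of Arboreto: `top_edges` and `compare_networks` are hypothetical helpers, and each edge is assumed to be a `(TF, target, importance)` tuple. Importance scores from separate runs are not on a shared scale, so the comparison uses ranks within each network rather than raw values.

```python
def top_edges(edges, n):
    """Return the n highest-importance (TF, target) pairs as a set."""
    ranked = sorted(edges, key=lambda edge: edge[2], reverse=True)
    return {(tf, target) for tf, target, _ in ranked[:n]}


def compare_networks(edges_a, edges_b, n=1000):
    """Compare the top-n edges of two networks inferred under different conditions."""
    a = top_edges(edges_a, n)
    b = top_edges(edges_b, n)
    union = a | b
    return {
        "shared": a & b,   # edges ranked in the top n under both conditions
        "only_a": a - b,   # edges specific to the first condition
        "only_b": b - a,   # edges specific to the second condition
        "jaccard": len(a & b) / len(union) if union else 0.0,
    }


# Toy edge lists with made-up values, for illustration only.
control = [("TF1", "G1", 5.0), ("TF1", "G2", 3.0), ("TF2", "G3", 1.0)]
treated = [("TF1", "G1", 4.0), ("TF2", "G3", 2.5), ("TF2", "G4", 0.5)]
result = compare_networks(control, treated, n=2)
```

If the edge table is a pandas DataFrame with columns in the order TF, target, importance, `network.itertuples(index=False, name=None)` should yield tuples in the shape these helpers expect.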