Accesses and analyzes functional genomics data from the NCBI Gene Expression Omnibus (GEO) repository.
The GEO Database skill provides specialized capabilities for interacting with the NCBI Gene Expression Omnibus, the primary public repository for high-throughput gene expression and functional genomics data. It streamlines the complex process of searching for studies, retrieving metadata, and downloading experimental results across more than 8 million samples. By integrating tools like GEOparse and NCBI E-utilities, this skill enables researchers and data scientists to programmatically handle Series (GSE), Samples (GSM), and Platforms (GPL), facilitating efficient transcriptomics workflows and meta-analyses directly within a development environment.
Key Features
01 Direct FTP access and batch downloading of large genomic supplementary files
02 Advanced dataset searching using NCBI E-utilities and Entrez keywords
03 Support for both microarray and RNA-seq data formats (SOFT/Matrix files)
04 Automated data retrieval and parsing via the GEOparse library
05 Hierarchical navigation of GSE, GSM, GPL, and GDS accession types
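The search and download features listed above can be sketched as two small URL builders. This is a minimal illustration rather than the skill's actual implementation: the helper names `esearch_url` and `series_dir_url` are hypothetical, and the example query uses the `gse[Entry Type]` filter as an assumed way to restrict results to Series records. The sketch relies on the E-utilities `esearch.fcgi` endpoint with `db=gds` (GEO DataSets) and on the GEO file-server layout, in which a Series directory is grouped under a stub formed by replacing the last three digits of the accession with `nnn`.

```python
from urllib.parse import urlencode

EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
GEO_FILE_ROOT = "https://ftp.ncbi.nlm.nih.gov/geo"


def esearch_url(term, retmax=20):
    """Build an E-utilities ESearch URL against the GEO DataSets database (db=gds)."""
    params = {"db": "gds", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


def series_dir_url(accession, subdir="suppl"):
    """Build the file-server directory URL for a GEO Series (GSE) accession.

    subdir is one of the per-series folders, e.g. "suppl", "matrix", or "soft".
    """
    acc = accession.upper()
    if not (acc.startswith("GSE") and acc[3:].isdigit()):
        raise ValueError(f"not a GSE accession: {accession!r}")
    digits = acc[3:]
    # GSE1000 -> GSE1nnn, GSE123456 -> GSE123nnn, GSE1 -> GSEnnn
    stub = "GSE" + digits[:-3] + "nnn"
    return f"{GEO_FILE_ROOT}/series/{stub}/{acc}/{subdir}/"


if __name__ == "__main__":
    # Example query (assumed field tags): human breast cancer Series records.
    print(esearch_url("breast cancer AND Homo sapiens[Organism] AND gse[Entry Type]"))
    print(series_dir_url("GSE1000", "matrix"))
```

The builders only construct URLs, so they can be checked offline. Actual requests should respect NCBI's usage guidelines for E-utilities, such as rate limits and the optional API key.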
Use Cases
01 Building expression matrices from multiple GEO Series for cross-study meta-analysis
02 Extracting platform annotations to map genomic probes to gene symbols and identifiers
03 Automating the collection of RNA-seq datasets for specific organisms or disease states
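The first two use cases above, probe-to-gene mapping and cross-study matrices, might look roughly like the following sketch. It uses plain dictionaries so that it runs without network access. The function names `collapse_probes_to_genes` and `merge_studies` are hypothetical, and averaging probes that map to the same gene is just one common choice, not a method the skill is known to use.

```python
from collections import defaultdict
from statistics import mean


def collapse_probes_to_genes(expression, probe_to_gene):
    """Average probe-level values into gene-level values.

    expression:    {probe_id: {sample_id: value}}
    probe_to_gene: {probe_id: gene_symbol}, e.g. taken from a platform (GPL) annotation
    returns:       {gene_symbol: {sample_id: mean value over that gene's probes}}
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for probe, samples in expression.items():
        gene = probe_to_gene.get(probe)
        if not gene:
            continue  # skip probes without an annotated gene symbol
        for sample, value in samples.items():
            grouped[gene][sample].append(value)
    return {
        gene: {sample: mean(values) for sample, values in samples.items()}
        for gene, samples in grouped.items()
    }


def merge_studies(studies):
    """Combine gene-level matrices, keeping only genes present in every study.

    studies: {series_id: {gene_symbol: {sample_id: value}}}
    """
    if not studies:
        return {}
    shared = set.intersection(*(set(matrix) for matrix in studies.values()))
    merged = {}
    for gene in sorted(shared):
        row = {}
        for matrix in studies.values():
            row.update(matrix[gene])  # GSM sample IDs are unique across Series
        merged[gene] = row
    return merged
```

With GEOparse, the inputs could plausibly come from `gse.pivot_samples("VALUE")` for the expression values and from a platform's `gpl.table` for the probe annotation. The annotation column names differ between platforms, so they need to be checked per GPL. Merged values from different studies generally still need normalization or batch correction before any meta-analysis.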