Processes genomic datasets including sequence alignments, variants, and reference sequences using a Pythonic interface to htslib.
Pysam is a Python module for reading, manipulating, and writing high-throughput sequencing data, aimed at bioinformaticians and researchers. This skill enables Claude to handle genomic file formats such as SAM/BAM/CRAM, VCF/BCF, and FASTA/FASTQ, and provides implementation patterns for region-based queries, pileup analysis, and coverage calculation. It is suited to building automated NGS pipelines, performing quality control, and extracting specific genetic information from large biological datasets, with performance backed by htslib.
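As a rough illustration of the region-based query and coverage patterns mentioned above, the sketch below counts reads and estimates mean depth over an interval. It assumes the caller passes an open, indexed `pysam.AlignmentFile`; the helper names (`count_reads`, `mean_depth`, `summarize_region`) and the `min_mapq` default are hypothetical and not part of pysam or of this skill.

```python
def count_reads(alignment_file, contig, start, end, min_mapq=20):
    """Count mapped primary reads overlapping [start, end) with MAPQ >= min_mapq.

    `alignment_file` is assumed to be an open, indexed pysam.AlignmentFile.
    Coordinates are 0-based and half-open, as in pysam's fetch().
    """
    kept = 0
    for read in alignment_file.fetch(contig, start, end):
        # Skip reads that would double-count or carry no placement.
        if read.is_unmapped or read.is_secondary or read.is_supplementary:
            continue
        if read.mapping_quality >= min_mapq:
            kept += 1
    return kept


def mean_depth(alignment_file, contig, start, end):
    """Estimate mean per-base depth over [start, end) from pileup columns.

    Positions with no coverage produce no pileup column, so the total is
    divided by the region length rather than by the number of columns.
    """
    if end <= start:
        raise ValueError("end must be greater than start")
    total = 0
    # truncate=True restricts columns to the requested interval.
    for column in alignment_file.pileup(contig, start, end, truncate=True):
        total += column.nsegments
    return total / (end - start)


def summarize_region(bam_path, contig, start, end):
    """Hypothetical convenience wrapper: open a BAM and run both helpers."""
    import pysam  # imported lazily so the helpers above work on any compatible object

    with pysam.AlignmentFile(bam_path, "rb") as bam:
        return {
            "reads": count_reads(bam, contig, start, end),
            "mean_depth": mean_depth(bam, contig, start, end),
        }
```

Passing the open file handle into the helpers, rather than a path, keeps them reusable across several regions without reopening the file. Note that pysam's pileup applies its own default read filtering, so the depth reported here may differ from a raw read count.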
Key Features
01 8 GitHub stars
02 Extract FASTA/FASTQ sequences with high-performance random access
03 Read and write SAM/BAM/CRAM alignment files
04 Query and filter VCF/BCF variant files using region-based fetching
05 Execute samtools and bcftools commands directly from Python
06 Perform pileup analysis for base-by-base coverage calculation
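The FASTA random-access and VCF region-fetching features listed above could be used roughly as follows. This is a minimal sketch: it assumes the caller supplies an open, indexed `pysam.FastaFile` and `pysam.VariantFile`, and the helper names and the `min_qual` threshold are illustrative choices rather than anything defined by pysam or this skill.

```python
def region_sequence(fasta, contig, start, end):
    """Return the upper-cased reference sequence for [start, end).

    `fasta` is assumed to be an open pysam.FastaFile backed by a .fai index.
    Coordinates are 0-based and half-open.
    """
    return fasta.fetch(contig, start, end).upper()


def gc_fraction(sequence):
    """Fraction of G and C bases in a sequence; 0.0 for an empty string."""
    if not sequence:
        return 0.0
    gc = sum(1 for base in sequence.upper() if base in "GC")
    return gc / len(sequence)


def variants_above_quality(variant_file, contig, start, end, min_qual=30.0):
    """Collect (chrom, pos, ref, alts) for records with QUAL >= min_qual.

    `variant_file` is assumed to be an open, indexed pysam.VariantFile.
    `pos` is the 1-based position as written in the VCF; records with a
    missing QUAL value are skipped.
    """
    selected = []
    for record in variant_file.fetch(contig, start, end):
        if record.qual is not None and record.qual >= min_qual:
            selected.append((record.chrom, record.pos, record.ref, record.alts))
    return selected
```

With real files, the handles would typically come from `pysam.FastaFile("ref.fa")` and `pysam.VariantFile("calls.vcf.gz")`, where both file names are placeholders. Region fetching on a VCF generally requires a bgzip-compressed file with a tabix or CSI index.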