Manages biological datasets with lineage tracking, ontology validation, and FAIR data principles for reproducible research.
LaminDB is a data framework for biology that turns messy datasets into queryable, traceable, and FAIR (Findable, Accessible, Interoperable, and Reusable) resources. This skill equips Claude to help researchers build biological data lakehouses, manage single-cell and spatial transcriptomics workflows, and validate metadata against biological ontologies through Bionty, LaminDB's ontology registry module. It tracks lineage from raw sequencing data to final results and integrates with workflow managers such as Nextflow and MLOps platforms such as Weights & Biases, so each result can be traced back to the data and code that produced it.
Key Features
01 Integration with bioinformatics pipelines, including Nextflow and Snakemake
02 Standardized annotation using biological ontologies for genes, cell types, and diseases
03 Unified querying across local and cloud-based biological lakehouses
04 Validation and curation workflows for single-cell (AnnData) and multi-modal datasets
05 Automatic lineage tracking for data, code, and computational environments
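To make the lineage-tracking feature concrete, here is a minimal sketch of the underlying idea in plain Python. It does not use the LaminDB API: `Run`, `TrackedArtifact`, `content_hash`, and `trace` are hypothetical names invented for this illustration, and the file keys are made up. The sketch assumes that each dataset records the run that created it and that each run records its inputs, which is enough to walk back from a result to the raw data.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List, Optional


def content_hash(data: bytes) -> str:
    """Return a short SHA-256 digest identifying the content."""
    return hashlib.sha256(data).hexdigest()[:12]


@dataclass
class Run:
    """One execution of a script (hypothetical record, not a LaminDB class)."""
    script: str
    inputs: List["TrackedArtifact"] = field(default_factory=list)


@dataclass
class TrackedArtifact:
    """A dataset plus the run that produced it (hypothetical record)."""
    key: str
    digest: str
    created_by: Optional[Run] = None


def trace(artifact: TrackedArtifact) -> List[str]:
    """List the keys of every upstream artifact, nearest ancestors first."""
    lineage = []
    stack = [artifact]
    while stack:
        current = stack.pop()
        run = current.created_by
        if run is None:  # raw data: nothing upstream
            continue
        for parent in run.inputs:
            lineage.append(parent.key)
            stack.append(parent)
    return lineage


# Illustrative three-step workflow with made-up keys and contents.
raw = TrackedArtifact("raw/reads.fastq", content_hash(b"ACGT"))
align = Run("align.py", inputs=[raw])
counts = TrackedArtifact("processed/counts.h5ad", content_hash(b"matrix"), created_by=align)
cluster = Run("cluster.py", inputs=[counts])
result = TrackedArtifact("results/clusters.csv", content_hash(b"labels"), created_by=cluster)

upstream = trace(result)
```

In a real deployment the framework would record these links automatically and persist them in a database; the sketch only shows why storing "created by" and "inputs" is sufficient to reconstruct provenance.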
Use Cases
01 Tracking data provenance and versioning in multi-step computational genomics workflows
02 Building a searchable, multi-user repository for heterogeneous biological research data
03 Curating and validating scRNA-seq datasets against standardized cell type ontologies
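The curation use case can be illustrated with a small, self-contained sketch of validating cell type labels against a controlled vocabulary. This is plain Python, not the LaminDB or Bionty API: `CELL_TYPE_VOCAB`, `SYNONYMS`, `standardize`, and `validate` are hypothetical names, the vocabulary is a three-term excerpt chosen for illustration, and the Cell Ontology identifiers should be checked against the current ontology release before reuse.

```python
# Tiny excerpt of a cell type vocabulary: canonical name -> ontology ID.
# Assumption: IDs are given for illustration; verify against the Cell Ontology.
CELL_TYPE_VOCAB = {
    "T cell": "CL:0000084",
    "B cell": "CL:0000236",
    "monocyte": "CL:0000576",
}

# Hypothetical synonym table mapping free-text variants to canonical names.
SYNONYMS = {
    "T-cell": "T cell",
    "T lymphocyte": "T cell",
    "B-cell": "B cell",
}


def standardize(labels):
    """Replace known synonyms with canonical names; leave other labels as-is."""
    return [SYNONYMS.get(label, label) for label in labels]


def validate(labels):
    """Split labels into validated (name -> ontology ID) and unknown terms."""
    validated = {}
    unknown = []
    for label in labels:
        if label in CELL_TYPE_VOCAB:
            validated[label] = CELL_TYPE_VOCAB[label]
        elif label not in unknown:
            unknown.append(label)
    return validated, unknown


# Labels as they might appear in an scRNA-seq annotation column (made up).
raw_labels = ["T cell", "T-cell", "B cell", "Tcell_cluster3"]
clean_labels = standardize(raw_labels)
validated, unknown = validate(clean_labels)
```

Labels that remain in `unknown` after standardization are the ones a curator would need to fix by hand or register as new terms, which is the decision point a curation workflow is built around.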