Geniml is a toolkit for building machine learning models on genomic interval data, such as BED files. It offers unsupervised methods for learning low-dimensional embeddings of genomic regions, single cells, and metadata labels, which support similarity search, clustering, and downstream analysis. Whether you are building consensus peak sets, analyzing single-cell chromatin accessibility, or running cross-modal queries between experimental conditions and genomic features, this skill provides the implementation patterns and workflows needed for genomic feature learning.
Key Features
01 Dimensionality reduction and clustering for single-cell ATAC-seq data
02 Generation of unsupervised genomic region embeddings via Region2Vec
03 2,066 GitHub stars
04 Joint representation learning for genomic regions and metadata labels
05 Genomic data randomization and tokenization utilities
06 Statistical methods for building reference peak universes