What is the European Nucleotide Archive (ENA)?

The ENA is a comprehensive public repository for nucleotide sequence data and associated metadata, managed by the European Bioinformatics Institute (EMBL-EBI).

Does the ENA API have rate limits?

Yes, ENA APIs generally have a rate limit of 50 requests per second. This skill includes best practices for handling rate limiting, such as implementing exponential backoff.
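The backoff pattern mentioned above can be sketched roughly as follows. This is a minimal illustration, not the skill's own implementation: the function names, the retry count, the base delay, and the choice to retry on HTTP 429 and 5xx responses are all assumptions made for the example.

```python
import random
import time
import urllib.error
import urllib.request


def backoff_delays(max_retries=5, base=0.5, cap=30.0):
    """Yield capped exponential delays: base, 2*base, 4*base, ..."""
    for attempt in range(max_retries):
        yield min(cap, base * (2 ** attempt))


def fetch_with_backoff(url, max_retries=5,
                       opener=urllib.request.urlopen, sleep=time.sleep):
    """GET url, retrying on HTTP 429 and 5xx with backoff plus jitter.

    `opener` and `sleep` are injectable (hypothetical parameters) so the
    retry logic can be exercised without network access.
    """
    last_error = None
    for delay in backoff_delays(max_retries):
        try:
            with opener(url, timeout=30) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            if err.code != 429 and err.code < 500:
                raise  # other client errors will not succeed on retry
            last_error = err
        # Random jitter keeps parallel clients from retrying in lockstep.
        sleep(delay + random.uniform(0, delay / 2))
    raise RuntimeError(f"gave up after {max_retries} attempts") from last_error
```

Keeping the delay schedule in its own generator makes it easy to tune the base and cap separately from the request logic.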

How do I search for data from a specific organism?

You can use the ENA Taxonomy API or the Portal API with taxonomic filters (like tax_tree) to find all records associated with a specific taxon ID or lineage.
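As a rough sketch of the taxonomic filter described above, the snippet below builds a Portal API search URL using tax_tree. The base URL, the parameter names (result, query, fields, format, limit), and the default field list reflect the Portal API as commonly documented but should be treated as assumptions here; taxon_search_url is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Assumed base URL of the ENA Portal API search endpoint.
PORTAL_SEARCH = "https://www.ebi.ac.uk/ena/portal/api/search"


def taxon_search_url(tax_id, result="read_run",
                     fields=("run_accession", "scientific_name"), limit=10):
    """Build a search URL for records at or below tax_id in the taxonomy."""
    params = {
        "result": result,
        # tax_tree(...) includes descendant taxa; tax_eq(...) would match
        # the exact taxon only.
        "query": f"tax_tree({tax_id})",
        "fields": ",".join(fields),
        "format": "json",
        "limit": limit,
    }
    return f"{PORTAL_SEARCH}?{urlencode(params)}"


# Example: runs for taxon ID 9606 (Homo sapiens) and its sub-taxa.
print(taxon_search_url(9606))
```

The function only constructs the URL, so it can be combined with whatever HTTP client and retry policy the pipeline already uses.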

Can I download raw sequencing reads like FASTQ files using this skill?

Yes, this skill provides the implementation patterns and API endpoints required to retrieve raw read data and genome assemblies via API, FTP, or Aspera.
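One possible way to locate FASTQ files is to request a file report for an accession and read its fastq_ftp column. The sketch below assumes the Portal API filereport endpoint and the fastq_ftp and fastq_md5 field names, and assumes multiple files are separated by semicolons with no URL scheme; fastq_links is a hypothetical helper, and the sample row uses made-up values.

```python
import csv
import io

# Assumed filereport endpoint; substitute a run or study accession.
FILEREPORT_URL = (
    "https://www.ebi.ac.uk/ena/portal/api/filereport"
    "?accession={accession}&result=read_run"
    "&fields=run_accession,fastq_ftp,fastq_md5&format=tsv"
)


def fastq_links(report_tsv, scheme="ftp"):
    """Map each run accession to a list of (url, md5) pairs from a TSV report."""
    links = {}
    for row in csv.DictReader(io.StringIO(report_tsv), delimiter="\t"):
        # Paired-end runs list several files in one cell, ';'-separated.
        paths = [p for p in row["fastq_ftp"].split(";") if p]
        sums = row["fastq_md5"].split(";")
        links[row["run_accession"]] = [
            (f"{scheme}://{path}", md5) for path, md5 in zip(paths, sums)
        ]
    return links


# Made-up report text, for illustration only.
sample = (
    "run_accession\tfastq_ftp\tfastq_md5\n"
    "RUN1\thost.example/RUN1_1.fastq.gz;host.example/RUN1_2.fastq.gz\tmd5a;md5b\n"
)
print(fastq_links(sample))
```

Returning the checksum alongside each URL lets a downloader verify files after transfer, whichever transport (FTP or Aspera) is used.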

What data formats are supported for sequence retrieval?

ENA supports various formats including XML and JSON for metadata, and FASTQ, FASTA, BAM/CRAM, and EMBL flat files for sequence data.
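For the record-level formats listed above, a small helper can pick the format through the URL path. This sketch assumes the Browser API exposes /fasta/, /embl/, and /xml/ paths under the base URL shown; record_url is a hypothetical name, and ACCESSION is a placeholder rather than a real identifier.

```python
# Assumed Browser API base URL and per-format path segments.
BROWSER_API = "https://www.ebi.ac.uk/ena/browser/api"
RECORD_FORMATS = {"fasta", "embl", "xml"}


def record_url(accession, fmt="fasta"):
    """Return the assumed Browser API URL for one record in the given format."""
    if fmt not in RECORD_FORMATS:
        raise ValueError(f"unsupported record format: {fmt!r}")
    return f"{BROWSER_API}/{fmt}/{accession}"


print(record_url("ACCESSION", "embl"))
```

Read-level formats such as FASTQ and BAM/CRAM are file downloads rather than record views, so they are deliberately left out of this helper.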

ENA Bioinformatics Database

Name: ENA Bioinformatics Database
Author: x-cmd

Data Science and ML

Accesses and retrieves nucleotide sequence data, raw reads, and genome assemblies from the European Nucleotide Archive.

The ena-database skill provides Claude with the specialized knowledge required to interact with the European Nucleotide Archive (ENA), a premier global repository for DNA and RNA sequence data. It facilitates programmatic access to biological records, including raw sequencing runs, study metadata, and taxonomic information via REST APIs and FTP. This skill is essential for bioinformatics workflows, allowing researchers and developers to automate data retrieval, query metadata with complex filters, and integrate massive genomics datasets into analysis pipelines while adhering to institutional best practices and rate limits.

Key Features

01 Programmatic retrieval of raw sequencing reads (FASTQ) and genome assemblies.

02 8 GitHub stars

03 Integration with ENA Taxonomy for organism-specific sequence lookups.

04 Advanced metadata searching using the ENA Portal and Browser APIs.

05 Support for multiple data formats including XML, JSON, FASTA, and CRAM.

06 Hierarchical data navigation across Studies, Samples, Experiments, and Runs.

Use Cases

01 Building bioinformatics pipelines that require cross-referencing ENA data with external databases.

02 Automating the download of sequencing data for specific study accessions.

03 Searching for genome assemblies based on taxonomic lineage or metadata criteria.
