Accesses and manages large-scale public cancer imaging datasets from the NCI Imaging Data Commons using the idc-index tool.
This skill lets researchers and developers query, visualize, and download public cancer imaging data (CT, MR, PET, and pathology) from the National Cancer Institute's Imaging Data Commons (IDC). Using the idc-index Python package, users can filter the available datasets by metadata, anatomical site, or modality without authentication. This makes it easier to find DICOM data for AI model training, clinical research, and medical imaging analysis, with direct access to metadata tables, clinical data, and public cloud storage locations.
Key Features
01 Download DICOM images directly from public cloud storage (S3/GCS) without authentication
02 Query extensive metadata using SQL for radiology and pathology datasets
03 Access structured clinical data and join it with imaging metadata for complex analysis
04 Visualize medical imaging series in-browser via generated viewer URLs
05 Filter data by modality, cancer type, anatomical site, and expert-curated analysis results
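The query, viewer, and download features can be sketched roughly as follows. This is a minimal illustration, not the skill's own implementation: the helper names `build_series_query` and `fetch_and_download` are hypothetical, and the sketch assumes the `idc-index` package is installed and that its `index` table exposes the columns `collection_id`, `PatientID`, `SeriesInstanceUID`, `Modality`, and `series_size_MB`.

```python
def build_series_query(modality, collection_id=None, limit=10):
    """Build a SQL string against the `index` table (hypothetical helper)."""
    # Basic validation, since the values are interpolated into SQL text.
    if not modality.isalnum():
        raise ValueError("modality must be alphanumeric, e.g. 'CT' or 'MR'")
    clauses = [f"Modality = '{modality}'"]
    if collection_id is not None:
        if not collection_id.replace("_", "").isalnum():
            raise ValueError("collection_id may contain only letters, digits, '_'")
        clauses.append(f"collection_id = '{collection_id}'")
    where = " AND ".join(clauses)
    return (
        "SELECT collection_id, PatientID, SeriesInstanceUID, series_size_MB "
        f"FROM index WHERE {where} "
        f"ORDER BY series_size_MB LIMIT {int(limit)}"
    )


def fetch_and_download(sql, download_dir):
    """Run the query, print one viewer URL, and download the matching series."""
    # Imported lazily so the query builder works without the package installed.
    from idc_index import IDCClient  # pip install idc-index

    client = IDCClient()
    df = client.sql_query(sql)  # returns a pandas DataFrame
    uids = df["SeriesInstanceUID"].tolist()
    if uids:
        # Browser URL for the first matching series.
        print(client.get_viewer_URL(seriesInstanceUID=uids[0]))
        # Downloads from public cloud storage; no credentials are needed.
        client.download_from_selection(
            downloadDir=download_dir, seriesInstanceUID=uids
        )
    return df


if __name__ == "__main__":
    # Only the query string is built here; nothing is downloaded.
    print(build_series_query("CT", limit=5))
```

Calling `fetch_and_download(build_series_query("MR", limit=3), "./idc_data")` would then query, print a viewer link, and download the three smallest matching MR series. Keeping the query builder separate from the network calls makes the filter logic easy to check before any data is transferred.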
Use Cases
01 Building large-scale AI training datasets for cancer detection using public DICOM images
02 Conducting clinical research by joining patient metadata with radiology and pathology scans
03 Validating AI-generated segmentations against expert-curated ground truth from IDC collections
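When assembling a training dataset, it can help to cap the download size before fetching anything. The sketch below is one possible approach under stated assumptions: `select_within_budget` is a hypothetical helper, the rows are assumed to be dictionaries with `SeriesInstanceUID` and `series_size_MB` keys (for example, from `df.to_dict("records")` on a query result), and the UIDs shown are placeholders rather than real IDC identifiers.

```python
def select_within_budget(rows, budget_mb):
    """Pick the smallest series first until the size budget is reached.

    Returns (list of chosen SeriesInstanceUIDs, total size in MB).
    """
    chosen, total = [], 0.0
    for row in sorted(rows, key=lambda r: r["series_size_MB"]):
        size = float(row["series_size_MB"])
        if total + size > budget_mb:
            break  # every remaining series is at least this large
        chosen.append(row["SeriesInstanceUID"])
        total += size
    return chosen, total


if __name__ == "__main__":
    # Placeholder rows for illustration only.
    example_rows = [
        {"SeriesInstanceUID": "uid-a", "series_size_MB": 50.0},
        {"SeriesInstanceUID": "uid-b", "series_size_MB": 10.0},
        {"SeriesInstanceUID": "uid-c", "series_size_MB": 30.0},
    ]
    uids, total_mb = select_within_budget(example_rows, budget_mb=45)
    print(uids, total_mb)
```

The selected UIDs could then be passed to the idc-index download call. Smallest-first selection maximizes the number of series under the cap; a different ordering (for example, by collection or patient) may suit other sampling goals better.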