01Integration of dataset versions into experiment and output logging
02Clear separation of raw data from derived/processed artifacts
03Lightweight metadata tracking designed to avoid repository bloat
040 GitHub stars
05Automated creation of dataset manifests in CSV, JSON, or TOML formats
06Source and version tracking with checksum-based integrity verification