Dingo

Detects data quality issues in datasets automatically using built-in rules and model evaluation methods.

About

Dingo is a data quality evaluation tool that automatically detects quality issues in datasets. It ships with built-in rules and model-based evaluation methods, and it also supports custom evaluation methods. It handles commonly used text and multimodal datasets, including pre-training, fine-tuning, and evaluation datasets. It can be used through a local CLI or an SDK, which makes it easy to integrate into existing evaluation platforms.
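
To make the idea of rule-based quality checks concrete, here is a minimal Python sketch. It is not Dingo's API: the rule names, the RULES registry, and the evaluate function are all hypothetical and only illustrate how a set of simple rules might be applied to text records to flag issues.

```python
import re
from typing import Callable, Dict, List

# Hypothetical rules for illustration only; these are NOT Dingo's built-in rules.
def rule_not_empty(text: str) -> bool:
    """Pass when the record contains non-whitespace content."""
    return bool(text.strip())

def rule_no_replacement_chars(text: str) -> bool:
    """Pass when the record has no U+FFFD characters (a sign of encoding damage)."""
    return "\ufffd" not in text

def rule_no_long_repeats(text: str) -> bool:
    """Pass when no single character repeats 10 or more times in a row."""
    return re.search(r"(.)\1{9,}", text) is None

# Hypothetical registry mapping a rule name to its check function.
RULES: Dict[str, Callable[[str], bool]] = {
    "not_empty": rule_not_empty,
    "no_replacement_chars": rule_no_replacement_chars,
    "no_long_repeats": rule_no_long_repeats,
}

def evaluate(records: List[str]) -> List[dict]:
    """Run every rule on every record and list the rules each record fails."""
    results = []
    for index, text in enumerate(records):
        failed = [name for name, rule in RULES.items() if not rule(text)]
        results.append({"index": index, "failed_rules": failed})
    return results

if __name__ == "__main__":
    samples = ["clean sample", "", "aaaaaaaaaaaa", "bad \ufffd char"]
    for result in evaluate(samples):
        print(result)
```

Keeping each rule as a small function that returns pass or fail makes it straightforward to add custom checks, which mirrors the general idea of combining built-in and custom evaluation methods.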

Key Features

  • Provides built-in rules for data quality checks
  • Supports LLM-based evaluation using models such as those from OpenAI and Llama3
  • Offers CLI and SDK for flexible integration
  • Supports text and image data modalities
  • Provides a GUI for visualizing evaluation results
  • 142 GitHub stars

Use Cases

  • Evaluating the quality of pre-training datasets
  • Assessing the quality of fine-tuning datasets
  • Identifying data quality issues in text and image datasets