
Codev-Bench

Evaluates code completion tools' ability to accurately capture a developer's intent and suggest appropriate code snippets in diverse contexts.

About

Codev-Bench is a comprehensive evaluation framework designed to assess the performance of code completion tools in real-world, repository-level, and developer-centric scenarios. It moves beyond traditional benchmarks that focus solely on generating functions from comments by incorporating diverse sub-scenes encountered in daily IDE-based coding, such as contextual completion for logical blocks, function parameter lists, and ordinary statements. Using unit tests and AST parsing, Codev-Bench accurately evaluates the quality of code generated by various Large Language Models (LLMs) across a range of completion scenarios, including full block, incomplete suffix, inner block, and Retrieval-Augmented Generation (RAG)-based completion.
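The two mechanisms mentioned above, AST parsing to identify the completion sub-scene and unit tests to judge the generated code, can be sketched roughly as follows. This is a minimal illustration using Python's standard `ast` module, not Codev-Bench's actual API; the function names, labels, and the `clamp` test case are all hypothetical.

```python
import ast

def classify_completion(snippet: str) -> str:
    """Label a candidate completion by its top-level AST node type
    (a hypothetical, coarse version of sub-scene classification)."""
    node = ast.parse(snippet).body[0]
    if isinstance(node, (ast.If, ast.For, ast.While)):
        return "logical block"
    if isinstance(node, ast.FunctionDef):
        return "function definition"
    return "ordinary statement"

def passes_unit_test(candidate_src: str) -> bool:
    """Execute the candidate and assert on its behavior,
    the way a unit-test-based check would."""
    namespace = {}
    try:
        exec(candidate_src, namespace)
        # Hypothetical expected behavior for this completion target.
        return namespace["clamp"](15, 0, 10) == 10
    except Exception:
        return False

candidate = "def clamp(x, lo, hi):\n    return max(lo, min(x, hi))\n"
print(classify_completion(candidate))  # function definition
print(passes_unit_test(candidate))     # True
```

Judging completions by executing unit tests, rather than by textual similarity to a reference, is what lets a benchmark accept any functionally correct completion regardless of surface form.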

Key Features

  • Fine-grained evaluation of code completion tools.
  • Real-world, repository-level benchmark.
  • Developer-centric scenarios.
  • Unit test-based evaluation.
  • Supports diverse completion sub-scenes.

Use Cases

  • Developing and improving code completion models to better align with developer needs.
  • Evaluating the performance of code completion models.
  • Identifying strengths and weaknesses of code completion tools in different scenarios.