HPC/AI Cluster Tool

Queries Kubernetes to retrieve GPU node information and cluster topologies for Azure HPC/AI clusters.

Introduction

This minimal Model Context Protocol (MCP) server targets Azure HPC/AI clusters and exposes tools for Kubernetes introspection. It queries cluster nodes for GPU information, including each node's labels and Ready/NotReady status, and for InfiniBand-related topology labels: agentpool, pkey, and torset. It is built with `fastmcp` and invokes `kubectl` through synchronous `subprocess` calls. A helper script is included for running and testing the tools locally, in-process, without an MCP host.
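
The synchronous `kubectl` approach described above might look roughly like the following. This is a minimal sketch, not the server's actual code: the function names (`run_kubectl_json`, `summarize_gpu_nodes`, `list_gpu_nodes`) are hypothetical, and it assumes GPU nodes can be recognized by a non-zero `nvidia.com/gpu` capacity, which the project may or may not rely on.

```python
import json
import subprocess


def run_kubectl_json(args):
    """Run kubectl synchronously and parse its JSON output."""
    result = subprocess.run(
        ["kubectl", *args, "-o", "json"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)


def is_ready(node):
    """Return True if the node's Ready condition has status "True"."""
    for cond in node.get("status", {}).get("conditions", []):
        if cond.get("type") == "Ready":
            return cond.get("status") == "True"
    return False  # no Ready condition reported


def summarize_gpu_nodes(node_list):
    """Keep nodes that advertise GPUs; report name, labels, and status."""
    summary = []
    for node in node_list.get("items", []):
        capacity = node.get("status", {}).get("capacity", {})
        # Assumption: GPU nodes expose the "nvidia.com/gpu" resource.
        if int(capacity.get("nvidia.com/gpu", "0")) < 1:
            continue
        meta = node.get("metadata", {})
        summary.append({
            "name": meta.get("name", ""),
            "labels": meta.get("labels", {}),
            "status": "Ready" if is_ready(node) else "NotReady",
        })
    return summary


def list_gpu_nodes():
    """Hypothetical tool body: query the cluster, then summarize."""
    return summarize_gpu_nodes(run_kubectl_json(["get", "nodes"]))
```

Keeping the parsing in `summarize_gpu_nodes` separate from the `subprocess` call means the logic can be exercised with a saved `kubectl get nodes -o json` payload, without access to a live cluster.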

Key Features

  • List GPU nodes with their names, labels, and Ready/NotReady status.
  • Retrieve InfiniBand-related topology labels (agentpool, pkey, torset) for each node.
  • Execute server tools locally for in-process testing and development without an MCP host.
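
As an illustration of the topology feature, the sketch below pulls agentpool, pkey, and torset values out of a node's label map and groups nodes by torset. The exact label keys the server reads are not stated here, so this example assumes the topology name is the last path segment of the label key; `topology_labels` and `group_by_torset` are hypothetical helpers, and the keys and values in the usage note are made up for demonstration.

```python
from collections import defaultdict

# Topology fields named in the feature list above.
TOPOLOGY_KEYS = ("agentpool", "pkey", "torset")


def topology_labels(labels):
    """Extract topology values, matching on the last segment of each key.

    Assumption: a label such as "<prefix>/torset" carries the torset value.
    Fields with no matching label are returned as None.
    """
    found = {key: None for key in TOPOLOGY_KEYS}
    for label_key, value in labels.items():
        short = label_key.rsplit("/", 1)[-1]
        if short in found and found[short] is None:
            found[short] = value
    return found


def group_by_torset(nodes):
    """Map each torset value to the names of the nodes that carry it."""
    groups = defaultdict(list)
    for node in nodes:
        topo = topology_labels(node.get("labels", {}))
        groups[topo["torset"]].append(node["name"])
    return dict(groups)
```

For example, calling `group_by_torset` on two nodes labeled with `ib/torset` set to `torset-00` would place both names under the `torset-00` key, while a node without any torset label would be collected under `None`.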

Use Cases

  • Monitor GPU node status and health within Azure HPC/AI environments.
  • Gather InfiniBand topology details for network configuration, optimization, or troubleshooting.
  • Develop and test Kubernetes introspection tools for HPC/AI clusters.