How does MLX differ from PyTorch in Claude Code?

MLX evaluates lazily, expects channels-last (NHWC) input for convolutions, and runs on Apple Silicon's unified memory, so arrays are not copied between CPU and GPU. The skill helps Claude handle these differences, for example by defining __call__ on a module instead of forward().

When should I use mx.eval() according to this skill?

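The skill directs Claude to call mx.eval() once per iteration, at the boundary of a training or generation step, rather than after individual operations inside the step. Each call forces the pending lazy computation to run, so scattering calls through inner code adds significant overhead.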

How does array indexing work differently in MLX?

Unlike NumPy, MLX slices create copies rather than views, and lists used for indexing must be explicitly converted to mx.array objects.

Does MLX support float64 on the GPU?

No, float64 is CPU-only in MLX. The skill identifies these instances and suggests using float32 or float16 for Metal GPU compatibility.

Can I use this skill to optimize LLM training on Mac?

Yes. The skill provides patterns for mx.compile, 4-bit quantization, and bfloat16 usage, all of which matter for efficient LLM training and inference on Apple Silicon.

Apple MLX Development

Name: Apple MLX Development
Author: tkwn2080

Category: Data Science & Machine Learning

Develops high-performance, idiomatic machine learning code for Apple Silicon using the MLX framework.

The mlx-dev skill empowers developers to build and optimize machine learning models specifically for Apple Silicon using the MLX framework. It provides expert guidance on MLX-specific patterns such as lazy evaluation, unified memory management, and Metal GPU acceleration. This skill is essential for migrating PyTorch or NumPy projects to MLX, ensuring idiomatic implementation of neural networks, and avoiding common pitfalls like unnecessary evaluations, incorrect array indexing, or CPU-only float64 operations.

Key Features

01 Correct implementation of NHWC formats and neural network modules

02 Idiomatic MLX code generation following Apple Silicon best practices

03 Optimization for lazy evaluation and precise mx.eval placement

04 Seamless migration guidance from PyTorch and NumPy architectures

05 Performance tuning for Unified Memory and mx.compile state management

06 3 GitHub stars

Use Cases

01 Optimizing training loops with JIT compilation and memory-efficient indexing

02 Implementing low-latency LLM inference and quantization on M-series chips

03 Converting PyTorch model architectures to native MLX for peak Mac performance


Install with 🐟 Skill.Fish

npx skillfish add tkwn2080/mlx-dev-skill mlx-dev

For use in Claude.ai and ChatGPT
