Can I use float64 with this MLX skill?

Not on the GPU. The skill flags float64 as CPU-only in MLX and suggests float32 or bfloat16 instead, so that your operations can run on the Metal GPU.
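As a rough illustration of that advice, the sketch below builds an array with an explicit float32 dtype and then casts it to bfloat16. It assumes the `mlx` package is installed; `gpu_friendly` is a hypothetical helper name, not part of the skill or of MLX.

```python
try:
    import mlx.core as mx  # assumption: the mlx package is installed
except ImportError:
    mx = None  # MLX unavailable on this machine; the demo below is skipped


def gpu_friendly(values):
    """Build a float32 array and a bfloat16 copy, avoiding float64."""
    full = mx.array(values, dtype=mx.float32)  # explicit 32-bit floats
    half = full.astype(mx.bfloat16)            # 16-bit copy of the same data
    return full, half


if mx is not None:
    full, half = gpu_friendly([1.0, 2.0, 3.0])
    print(full.dtype, half.dtype)
```

Setting the dtype explicitly at array creation keeps float64 from entering the computation through converted inputs.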

Does it support LLM optimization?

Yes, it provides guidance on using 4-bit quantization and bfloat16 to shrink model weights, which reduces memory footprint and makes better use of memory bandwidth during LLM inference on Apple Silicon.
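To make the memory argument concrete, here is a back-of-the-envelope sketch in plain Python. The 7-billion-parameter model, the group size of 64, and the 16-bit scale and bias stored per group are all assumptions chosen for illustration; the function names are hypothetical.

```python
def quantized_bits_per_weight(bits=4, group_size=64, overhead_bits=32):
    """Effective bits per weight for group-wise quantization.

    Assumes each group of `group_size` weights also stores one scale and
    one bias at 16 bits each (32 overhead bits per group).
    """
    return bits + overhead_bits / group_size


def weights_gb(n_params, bits_per_weight):
    """Weight storage in decimal gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9


n_params = 7_000_000_000  # hypothetical 7B-parameter model

print(weights_gb(n_params, 32))                           # float32: 28.0
print(weights_gb(n_params, 16))                           # bfloat16: 14.0
print(weights_gb(n_params, quantized_bits_per_weight()))  # 4-bit: 3.9375
```

Under these assumptions the 4-bit weights take roughly a seventh of the float32 footprint, so each generated token requires reading far fewer bytes from memory.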

Does this skill help with PyTorch to MLX migration?

Yes, it includes specific mappings from PyTorch to their MLX equivalents, covering NCHW to NHWC format conversion, layer overrides, and weight conversion patterns.
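The layout change can be sketched at the level of shapes, without any framework installed. The example assumes PyTorch-style activations in (N, C, H, W) order and convolution weights in (out, in, kH, kW) order; `permute` is a hypothetical helper, and the shapes are made up for illustration.

```python
# Axis order that moves the channel axis to the end: NCHW -> NHWC.
NCHW_TO_NHWC = (0, 2, 3, 1)


def permute(shape, axes):
    """Return the shape an array would have after transposing with `axes`."""
    return tuple(shape[a] for a in axes)


activations_nchw = (8, 3, 224, 224)  # batch, channels, height, width
conv_weight_oihw = (64, 3, 7, 7)     # out, in, kernel height, kernel width

print(permute(activations_nchw, NCHW_TO_NHWC))  # (8, 224, 224, 3)
print(permute(conv_weight_oihw, NCHW_TO_NHWC))  # (64, 7, 7, 3)
```

With real arrays, the same axes tuple would be passed to a transpose call (for example `mx.transpose(x, axes)` in MLX) when converting inputs and convolution weights.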

Why is lazy evaluation important in MLX?

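MLX records operations in a computation graph and runs them only when a result is actually needed, most explicitly when mx.eval() is called. Evaluating too often adds per-call overhead, while evaluating too rarely lets the graph grow very large. This skill helps you place mx.eval() calls at sensible points, such as once per loop iteration, to avoid both problems.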
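A minimal sketch of evaluating at a loop boundary, assuming the `mlx` package is installed. `running_sum` is a hypothetical function for illustration; a real training loop would typically evaluate the loss and model parameters at the same point.

```python
try:
    import mlx.core as mx  # assumption: the mlx package is installed
except ImportError:
    mx = None  # MLX unavailable on this machine; the demo below is skipped


def running_sum(steps):
    """Accumulate lazily, forcing evaluation once per loop iteration."""
    total = mx.zeros((4,))
    for _ in range(steps):
        total = total + mx.ones((4,))  # only extends the graph
        mx.eval(total)                 # compute here, at the loop boundary
    return total


if mx is not None:
    print(running_sum(3).tolist())
```

Calling `mx.eval` once per iteration keeps the pending graph small without forcing a separate evaluation after every individual operation.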

Apple MLX Development

Name: Apple MLX Development
Author: luqmannurhakimbazman


Data Science & ML

Writes idiomatic Apple MLX code for high-performance machine learning optimized for Apple Silicon hardware.

The Apple MLX Development skill provides specialized guidance for building and optimizing machine learning models natively on Apple Silicon. It helps developers navigate the critical differences between MLX and frameworks like PyTorch or NumPy, specifically regarding lazy evaluation, unified memory management, and Metal GPU acceleration. By enforcing best practices for array indexing, NHWC formats, and graph compilation, this skill ensures that ML workloads achieve peak performance on Mac hardware while avoiding common pitfalls like redundant evaluations or CPU-only data type bottlenecks.

Key Features

01 Translates PyTorch and NumPy patterns into idiomatic MLX implementations

02 Configures neural network modules using NHWC format and native __call__ patterns

03 Optimizes lazy evaluation by managing mx.eval() at efficient loop boundaries

04 Implements state-aware graph compilation using mx.compile for maximum throughput

05 Manages Apple Silicon unified memory and Metal GPU stream allocation

Use Cases

01 Developing custom neural network layers optimized for Metal GPU acceleration

02 Building high-performance LLM inference applications with 4-bit quantization

03 Porting existing PyTorch models to run natively and efficiently on Apple Silicon


Install with 🐟 Skill.Fish

npx skillfish add luqmannurhakimbazman/ashford mlx-dev

For use in Claude.ai and ChatGPT
