Introduction
This skill provides a framework for debugging TensorFlow applications. It helps developers identify and fix common machine learning pitfalls such as tensor shape mismatches, GPU configuration errors, and numerical instabilities like NaN or Inf values. It relies on tf.debugging assertions, TensorBoard profiling, and memory management strategies to diagnose and optimize data pipelines and model training. Whether you are facing Out-of-Memory (OOM) errors, vanishing gradients, or SavedModel version conflicts, the skill walks you through three phases: reproducing the problem, isolating its cause, and resolving it.
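As a rough illustration of the tf.debugging assertions mentioned above, the following minimal sketch checks a batch for shape mismatches and NaN/Inf values before it reaches a training step. The helper name `describe_failure`, the symbolic dimension names `"N"` and `"D"`, and the returned labels are hypothetical and chosen only for this example; the skill itself may structure these checks differently.

```python
import tensorflow as tf


def describe_failure(features, labels):
    """Hypothetical helper: return "ok" or a label for the first failed check."""
    # Shape check: features must be rank 2, labels rank 1, and both must
    # share the same batch dimension, written here as the symbol "N".
    try:
        tf.debugging.assert_shapes([
            (features, ("N", "D")),
            (labels, ("N",)),
        ])
    except (ValueError, tf.errors.InvalidArgumentError):
        return "shape mismatch"

    # Numerics check: raises InvalidArgumentError if any element of the
    # floating-point tensor is NaN or Inf.
    try:
        tf.debugging.check_numerics(features, message="features contain NaN/Inf")
    except tf.errors.InvalidArgumentError:
        return "non-finite values"

    return "ok"


if __name__ == "__main__":
    good = tf.zeros([4, 3])
    bad = tf.constant([[1.0, float("nan")]])
    print(describe_failure(good, tf.zeros([4])))  # consistent batch
    print(describe_failure(good, tf.zeros([5])))  # batch sizes differ
    print(describe_failure(bad, tf.zeros([1])))   # contains a NaN
```

Running the shape check before the numerics check mirrors the isolation phase: a mismatched batch is usually cheaper to rule out than a numerical instability, so it makes sense to eliminate it first.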