Introduction
This skill provides a comprehensive framework for red teaming machine learning training pipelines by simulating advanced adversarial attacks. It enables security analysts to evaluate how models respond to label flipping, backdoor triggers, and sophisticated clean-label attacks that are often undetectable by standard validation. By integrating this skill into the development lifecycle, teams can identify critical weaknesses in data ingestion, fine-tuning, and RLHF processes, ultimately mapping vulnerabilities to industry-standard frameworks like OWASP LLM and MITRE ATLAS.
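As a concrete illustration of the simplest attack mentioned above, the sketch below shows how a label-flipping poisoning run might be simulated against a training set. This is a minimal, self-contained example; the `flip_labels` helper and its parameters are hypothetical names chosen for illustration, not part of the skill's actual API.

```python
import random

def flip_labels(labels, num_classes, flip_rate, seed=0):
    """Simulate a label-flipping attack: reassign a fraction of labels
    to a different (incorrect) class. Returns the poisoned label list
    and the indices that were altered, so a red-team harness can later
    measure how much the flips degraded model accuracy."""
    rng = random.Random(seed)  # fixed seed keeps the simulation reproducible
    poisoned = list(labels)
    n_flip = int(len(labels) * flip_rate)
    flipped_idx = rng.sample(range(len(labels)), n_flip)
    for i in flipped_idx:
        # Pick any class other than the current (true) one.
        choices = [c for c in range(num_classes) if c != poisoned[i]]
        poisoned[i] = rng.choice(choices)
    return poisoned, flipped_idx

# Example: poison 30% of a tiny 3-class label set.
clean = [0, 1, 2, 0, 1, 2, 0, 1, 2, 0]
poisoned, idx = flip_labels(clean, num_classes=3, flip_rate=0.3)
```

In a real assessment the flipped indices would be kept secret from the training team, and the evaluator would compare clean-model and poisoned-model metrics to quantify the pipeline's sensitivity to this attack class.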