Reverse KL divergence (MiniLLM) for improved generative distillation (see the first sketch after this list)
3,983 GitHub stars
Logit-based and response-based distillation strategy implementations
Temperature scaling to soften probability distributions (a combined sketch follows this list)
Deep integration with Hugging Face Transformers and PyTorch
Multi-teacher ensemble distillation support (see the final sketch after this list)
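To make the first item concrete, here is a minimal sketch of reverse KL distillation in plain PyTorch, in the spirit of MiniLLM. The function name `reverse_kl_loss` and the per-token averaging are assumptions for illustration, not this library's actual API.

```python
import torch
import torch.nn.functional as F

def reverse_kl_loss(student_logits, teacher_logits, temperature=1.0):
    """Reverse KL, KL(student || teacher), over the vocabulary dimension."""
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_log_probs = F.log_softmax(teacher_logits / temperature, dim=-1)
    student_probs = student_log_probs.exp()
    # sum_x q(x) * (log q(x) - log p(x)): mode-seeking, so the student is
    # penalized for placing mass where the teacher assigns little probability.
    per_token = (student_probs * (student_log_probs - teacher_log_probs)).sum(dim=-1)
    return per_token.mean()
```

Because reverse KL is mode-seeking rather than mean-seeking, it tends to suit open-ended generation better than forward KL, which is the motivation behind MiniLLM-style training.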
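Logit-based distillation and temperature scaling usually appear together, so the sketch below combines them in one Hinton-style loss. The function name and default temperature are illustrative assumptions, not the library's documented interface.

```python
import torch
import torch.nn.functional as F

def logit_distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Forward-KL logit distillation with temperature-softened targets."""
    # Temperature > 1 flattens both distributions, exposing the teacher's
    # relative preferences among non-argmax classes ("dark knowledge").
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    # Multiplying by T^2 keeps gradient magnitudes roughly constant
    # as the temperature changes (Hinton et al., 2015).
    return F.kl_div(student_log_probs, teacher_probs,
                    reduction="batchmean") * temperature ** 2
```

Response-based distillation, by contrast, trains the student directly on the teacher's generated outputs (e.g. with an ordinary cross-entropy loss on teacher-produced text) rather than on its logits.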
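Finally, a sketch of multi-teacher ensemble distillation. The simple weighted average of softened teacher distributions shown here is one common scheme and an assumption on my part; the library may combine teachers differently.

```python
import torch
import torch.nn.functional as F

def multi_teacher_loss(student_logits, teacher_logits_list,
                       weights=None, temperature=2.0):
    """Distill toward a (weighted) average of several teachers' soft targets."""
    if weights is None:
        weights = [1.0 / len(teacher_logits_list)] * len(teacher_logits_list)
    # Build the ensemble target by averaging temperature-softened distributions.
    ensemble = sum(w * F.softmax(t / temperature, dim=-1)
                   for w, t in zip(weights, teacher_logits_list))
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    return F.kl_div(student_log_probs, ensemble,
                    reduction="batchmean") * temperature ** 2
```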