Overview
This production-ready machine learning pipeline scores sales leads: it predicts the probability that a lead converts into a customer, based on historical CRM data. It has three parts: Exploratory Data Analysis (EDA) that covers categorical features, missing values, and leakage prevention; an automated training pipeline that uses XGBoost and reaches an AUC of about 0.88; and a FastAPI service for real-time inference. The whole system is Dockerized, so it can be deployed to cloud environments and put to use in sales operations.
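
The leakage concern mentioned above can be illustrated with a small, standard-library-only sketch. It is not the project's actual preprocessing code: the function names (`split_rows`, `fit_imputer`, `apply_imputer`) and the `visits` column are hypothetical, and median imputation is an assumed strategy. The point is only the ordering: split first, learn fill values from the training rows alone, then apply them to held-out rows.

```python
import random
from statistics import median


def split_rows(rows, test_fraction=0.25, seed=0):
    """Shuffle and split rows into (train, test) before computing any statistics."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def fit_imputer(train_rows, numeric_cols):
    """Learn one fill value (the median) per column from the training rows only."""
    fills = {}
    for col in numeric_cols:
        observed = [r[col] for r in train_rows if r[col] is not None]
        if observed:  # skip columns with no observed values in training
            fills[col] = median(observed)
    return fills


def apply_imputer(rows, fills):
    """Return copies of rows with missing values replaced by the learned fills."""
    return [
        {k: (fills[k] if v is None and k in fills else v) for k, v in r.items()}
        for r in rows
    ]


# Hypothetical lead records; "visits" is an assumed numeric CRM feature.
train = [{"visits": 1}, {"visits": 3}, {"visits": None}]
held_out = [{"visits": None}, {"visits": 100}]

fills = fit_imputer(train, ["visits"])      # uses training rows only
held_out_clean = apply_imputer(held_out, fills)
```

Because the held-out value of 100 never enters `fit_imputer`, the fill value reflects only what the model could have known at training time. Computing the median over all rows before splitting would leak held-out information into training and could inflate the measured AUC.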