Speaker: Keira Zhou



Feature drift monitoring as a service for machine learning models at scale


At Capital One, we have hundreds of models in production to fight fraudsters, improve customer experience, and make business decisions. Monitoring is a critical and required capability for understanding what’s happening so that problems can be prevented or remediated in a timely fashion. Our team provides a common capability for the enterprise to visualize the statistical characteristics of our machine learning model inputs and to detect various shifts in them. The monitoring solution uses both rule-based and model-based techniques. We collect central tendency measures, variability measures, and the Population Stability Index (PSI). These metrics can be observed by both the feature producer and the feature consumers, who include data scientists, engineers, model risk officers, and stakeholders. The metrics allow each tech team to set thresholds tailored to its use cases, so that it can identify data volatility that might negatively impact the model output. We also offer a fully automated approach that leverages anomaly detection and change point detection models. During this talk, we will walk through the implementation details of how we monitor feature data for various model use cases at Capital One. Our technology stack includes Kafka, Spark, Airflow, Kubernetes, InfluxDB, Grafana, and GraphQL.
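To make the rule-based side of this concrete, here is a minimal sketch of how a Population Stability Index check for one numeric feature might look. It is not the implementation described in the talk: the function names `psi` and `drift_alert`, the equal-width binning, the `eps` floor for empty bins, and the default threshold of 0.2 are all assumptions chosen for illustration. The formula itself is the standard one, PSI = sum over bins of (a_i - e_i) * ln(a_i / e_i), where e_i and a_i are the shares of baseline and current values that fall in bin i.

```python
import math
from bisect import bisect_right


def psi(baseline, current, n_bins=10, eps=1e-6):
    """Population Stability Index between two samples of one numeric feature.

    Bins are equal-width over the baseline range (an illustrative choice);
    current values outside that range fall into the first or last bin.
    """
    if not baseline or not current:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(baseline), max(baseline)
    if lo == hi:
        raise ValueError("baseline has no spread; cannot build bins")
    width = (hi - lo) / n_bins
    # Interior edges only, so the outer bins are open-ended.
    edges = [lo + i * width for i in range(1, n_bins)]

    def proportions(values):
        counts = [0] * n_bins
        for v in values:
            counts[bisect_right(edges, v)] += 1
        total = len(values)
        # Floor each share at eps so the logarithm below is always defined.
        return [max(c / total, eps) for c in counts]

    expected = proportions(baseline)
    actual = proportions(current)
    return sum((a - e) * math.log(a / e) for e, a in zip(expected, actual))


def drift_alert(baseline, current, threshold=0.2):
    """Return True when PSI exceeds a team-chosen threshold (hypothetical default)."""
    return psi(baseline, current) > threshold
```

A team would call `drift_alert(training_values, recent_values, threshold=...)` with whatever threshold suits its use case. Identical samples give a PSI of exactly zero, and the value grows as the current distribution moves away from the baseline.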
Who is this presentation for?
Data scientists, data engineers, data analysts
Prerequisite knowledge:
Basic understanding of machine learning and data engineering
What you'll learn:
Ideas on how to monitor feature drift for machine learning model pipelines


Keira is a data engineering manager at Capital One who has hands-on expertise in both data engineering and data science. She has built streaming and batch data pipelines, deployed machine learning models in production, and is currently focusing on feature monitoring. Prior to Capital One, Keira received her master's degree with a focus on Natural Language Processing.