An identity platform like ForgeRock is the backbone of an enterprise, with a view of all apps, identities, devices, and resources attempting to connect with each other. This is an ideal position from which to gather rich identity log data and use it to prevent data breaches. In my previous blog, I discussed how we detect data breaches using identity logs. Now I am back to discuss how we test the accuracy of our breach prevention algorithms, because the last thing you want to do is introduce false positives that add friction to your identity flows.
Building Metrics To Test Algorithm Accuracy
In order to measure accuracy, we have to build our measuring stick, which comes in the form of a series of metrics against which we can evaluate the algorithms:
Core Metrics: We use multi-stage data and ML pipelines and embed different metrics into each stage to measure the effectiveness of our models and pipelines. We introduce various weighted scores to measure model accuracy, computation latency, and the efficiency of our pipelines.
Business Metrics: We put context around our metrics because we know we are working with identity use cases. Here our job is to build a realistic correlation between core metrics and business metrics, without which we cannot gauge the success or failure of the models. We track Anomalies Detected, Positive Action Rate, Negative Action Rate, False Anomalies Detection Rate, and many other relevant metrics. These metrics measure the real-world health of our ML models and help in making executive decisions.
Are more metrics better? Not always. Sometimes more metrics can lead to confusion. We constantly audit and modify our business and core metrics. Our core metrics track the health of our models and pipelines, and they are also aggregated to provide insight into our business metrics.
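To make the business metrics above more concrete, here is a minimal Python sketch of how such rates might be derived from reviewed anomalies. The post does not define these metrics precisely, so the outcome labels ("positive_action", "negative_action", "false_anomaly") and the choice to divide each count by the total number of anomalies detected are assumptions made for illustration only.

```python
from collections import Counter


def business_metrics(outcomes):
    """Summarise reviewed anomaly outcomes into simple rates.

    `outcomes` holds one label per detected anomaly. The label names are
    hypothetical; they are not taken from any ForgeRock product.
    """
    counts = Counter(outcomes)
    detected = len(outcomes)

    def rate(label):
        # Guard against division by zero when nothing was detected.
        return counts[label] / detected if detected else 0.0

    return {
        "anomalies_detected": detected,
        "positive_action_rate": rate("positive_action"),
        "negative_action_rate": rate("negative_action"),
        "false_anomaly_detection_rate": rate("false_anomaly"),
    }


if __name__ == "__main__":
    reviewed = ["positive_action"] * 6 + ["negative_action"] * 2 + ["false_anomaly"] * 2
    print(business_metrics(reviewed))
```

Keeping every rate on the same denominator makes the numbers easy to compare from one release to the next, which is one way to keep a small set of metrics readable.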
Using A/B Testing To Reduce Risk and Learn More
Thanks to our metrics work, we are now in a place where we trust our algorithms, but we constantly want to make them better, smarter, and faster. A/B testing gives us a way to grow our capability safely.
A/B testing of Models: A/B testing helps us release an upgraded model version to a controlled set of users. This makes it easier to target our customer base and collect qualitative metrics from the A/B testing effort.
Truly Random or Controlled Random: We prefer uniformly weighted, controlled random sampling for our A/B testing. This helps us control new model rollouts and ensures that customer experience is not affected during a phased rollout of our models.
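One common way to get controlled randomness is to hash a stable identifier into a uniform bucket, so that each user always lands in the same group and the share of users on the new model can be raised gradually. The Python sketch below illustrates that idea. It is not necessarily how ForgeRock assigns users, and the function name, the `salt` value, and the `treatment_share` parameter are all hypothetical.

```python
import hashlib


def assign_variant(user_id, treatment_share=0.1, salt="model-v2-rollout"):
    """Deterministically assign a user to model "A" (current) or "B" (new).

    Hashing salt + user_id gives a value that is effectively uniform in
    [0, 1), so roughly `treatment_share` of users see the new model.
    All names and defaults here are illustrative assumptions.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "B" if bucket < treatment_share else "A"


if __name__ == "__main__":
    users = [f"user-{i}" for i in range(10_000)]
    share_b = sum(assign_variant(u) == "B" for u in users) / len(users)
    print(f"share of users on the new model: {share_b:.3f}")
```

Because the assignment depends only on the salt and the user identifier, a user keeps the same model across sessions, and changing the salt reshuffles the groups for the next experiment.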
Going Back In Time Helps Build Trust
When we modify, refine, or tweak anomaly algorithms, we can run the new version against historical data. This is data we know and trust and already have metrics for, which gives us more confidence in the accuracy of the new version. This backtesting involves random sampling of historical data with different cross-validation methods to test for divergence in our core metrics.
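The backtesting idea can be sketched in a few lines of Python: shuffle labelled historical events into random folds, score the current and candidate detectors on each fold, and look at how far a core metric diverges. This is a simplified illustration under stated assumptions. Precision stands in for whichever core metric is tracked, the folds are used for evaluation only (no retraining), and `old_model` and `new_model` are hypothetical callables that return True when an event is flagged as anomalous.

```python
import random
from statistics import mean


def precision(predictions, labels):
    """Fraction of flagged events that were true anomalies (0.0 if none flagged)."""
    flagged = [label for pred, label in zip(predictions, labels) if pred]
    return sum(flagged) / len(flagged) if flagged else 0.0


def backtest(events, labels, old_model, new_model, folds=5, seed=0):
    """Compare two detectors on random folds of labelled historical data.

    Returns the mean precision gap (new minus old) and the per-fold gaps.
    """
    indices = list(range(len(events)))
    random.Random(seed).shuffle(indices)  # seeded so the run is repeatable
    gaps = []
    for k in range(folds):
        fold = indices[k::folds]  # every folds-th shuffled index
        fold_labels = [labels[i] for i in fold]
        old_p = precision([old_model(events[i]) for i in fold], fold_labels)
        new_p = precision([new_model(events[i]) for i in fold], fold_labels)
        gaps.append(new_p - old_p)
    return mean(gaps), gaps


if __name__ == "__main__":
    # Synthetic stand-in for historical data: scores 0..999, true anomalies >= 900.
    events = list(range(1000))
    labels = [score >= 900 for score in events]
    avg_gap, per_fold = backtest(events, labels, lambda s: s >= 800, lambda s: s >= 900)
    print(f"mean precision gap: {avg_gap:.3f}")
```

A gap that stays consistent across folds suggests the change is real rather than an artifact of one sample, while a large swing between folds is a signal to investigate before rollout.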
The Future Is Exciting; Let’s Collaborate
In this two-part series, we discussed how ForgeRock leverages Artificial Intelligence (AI) to prevent data breaches. We have been able to successfully leverage AI to detect anomalies and avert breaches. We continue to pioneer advanced features and techniques to make our platform and ML models faster and better at detecting and averting breaches. We love partnering with ForgeRock customers in building our algorithms. If you are a current customer with an interest in anomaly detection on identity logs, we’d love to collaborate with you! Please reach out to your ForgeRock representative if you are interested. A special thanks to Mary Writz for proofreading this post.
Prevent Data Breaches: Find Out More