Dynamic Threshold Estimation For Anomaly Detection

Moderator: Forum moderators

riyasimla200
Newcomer
Posts: 1
Joined: 11 Sep 2022, 11:00


Post by riyasimla200 » 11 Sep 2022, 11:00

The quest for time-series anomaly detection at Sinch, part two

Many infrastructure and performance monitoring tools offer built-in anomaly detection, but they often generate too many false positives. This is the second blog post in a series where we describe our journey in building a better performance monitoring tool for chatbots. You can find part one here.

Anomaly detection can also be formulated as a prediction problem. Anomalies are unexpected events, which makes them hard to predict. If you build a system that predicts the value of the next measurement well, you can compare that prediction with the actual measurement.

If there is a large difference between what you predicted and what you measured, an anomaly probably occurred. This blog post dives deeper into the methods we used to find out how big that difference should be before we consider it an anomaly.

Statistical threshold estimation

There are many machine learning methods for predicting time-series values. Some of them, such as ARIMA or Gaussian Processes, even come with tools to estimate confidence boundaries. The output is typically a Gaussian, which tells you how likely it is that the actual measurement will fall between certain boundaries. A commonly used approach is the 3 Sigma rule.
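To make the predict-and-compare idea concrete, here is a minimal sketch in Python. It is not the method used at Sinch: the rolling mean is only a stand-in for a real forecaster such as ARIMA, and the function name `detect_anomalies` and the parameters `window` and `k` are assumptions made for illustration.

```python
from statistics import mean, stdev


def detect_anomalies(values, window=20, k=3.0):
    """Return indices whose value is more than k sigma away from a naive forecast.

    Assumption: the forecast is the mean of the previous `window` values,
    and sigma is the sample standard deviation of that same window.
    """
    flagged = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        forecast = mean(history)   # stand-in for a real prediction model
        sigma = stdev(history)     # spread of the recent measurements
        # Skip flat windows (sigma == 0), where the threshold is undefined.
        if sigma > 0 and abs(values[i] - forecast) > k * sigma:
            flagged.append(i)
    return flagged


# A steady signal followed by one spike; only the spike should be flagged.
signal = [10, 11, 10, 9] * 10 + [50]
spikes = detect_anomalies(signal)
```

With `k=3.0` this is the 3 Sigma rule applied to the difference between the forecast and the measurement. A real model would supply its own predictive standard deviation instead of the window estimate used here.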

[Image]

If your measurement is more than three standard deviations away from your average, it is considered an anomaly. But as illustrated in the image above, there is still roughly a 0.1% probability on each side of the distribution (about 0.3% over both sides) that a perfectly normal measurement falls outside these boundaries. In that case the flagged measurement was, in fact, an expected value. Using this approach you actually build an outlier detector, but statistical outliers aren't necessarily anomalies.

Imagine you have 100 corporate customers active in 100 countries. If you want to monitor their calls for each country every minute in real time, you will have 10,000 data points per minute for your anomaly detection algorithm to monitor. If you apply the 3 Sigma rule, each of those data points still has at least a 0.1% probability of being flagged by mistake.
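A quick back-of-the-envelope calculation shows why this matters at scale. The sketch below assumes the measurements really are Gaussian and independent, which real traffic rarely is, and the function names are made up for illustration.

```python
import math


def gaussian_tail_probability(k, two_sided=True):
    """Probability that a Gaussian value lies more than k sigma from the mean."""
    one_sided = 0.5 * math.erfc(k / math.sqrt(2))
    return 2 * one_sided if two_sided else one_sided


def expected_false_alarms(data_points, false_positive_rate):
    """Expected number of normal data points flagged by mistake."""
    return data_points * false_positive_rate


customers = 100
countries = 100
points_per_minute = customers * countries  # 10,000 series checked every minute

# The 0.1% figure from the post (roughly the one-sided tail beyond 3 sigma).
per_minute = expected_false_alarms(points_per_minute, 0.001)
per_day = per_minute * 60 * 24

# The exact two-sided tail, for comparison with the rounded figure.
two_sided_rate = gaussian_tail_probability(3.0)
per_minute_two_sided = expected_false_alarms(points_per_minute, two_sided_rate)
```

At a 0.1% error rate, 10,000 data points per minute give about 10 false alarms every minute even when nothing is wrong. With the two-sided rate of about 0.27% the figure is closer to 27 per minute.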
