Dynamic Baselines

Dynamic baselines allow users to create alert rules that more accurately reflect the natural variance in test data. Using standard deviation, percentage change, or absolute values, you can configure alert rules that dynamically determine whether to fire or not, based on the historical data of a specific test.

Note: Dynamic baselines are currently only available for Cloud and Enterprise Agent alerts and Endpoint Agent alerts.

24-Hour Window for Dynamic Baselines

When you configure an alert rule that uses a dynamic baseline in its alert conditions, the ThousandEyes platform looks back at a 24-hour window to determine the baseline of a specific metric.

With a 24-hour window, the baseline for the metric is based on at least 24 data points (24 hours of data, even at the maximum test interval of 1 hour). This ensures that alerts are triggered by a valid deviation from a reliable baseline. For additional accuracy, the baseline is updated every 5 minutes to accommodate data from new test rounds.

3-Hour Window for Standard Deviation Dynamic Baselines

When your dynamic baseline is calculated based on a standard deviation, the ThousandEyes platform looks back at a 3-hour window for the standard deviation. (The average baseline is still based on the 24-hour window.) This ensures that any deviation from the mean is judged against the most recent data points, and mitigates the risk that an unusually steady standard deviation over a full 24-hour period would prevent an alert of interest from firing.
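
As a rough illustration of the two windows, here is a minimal Python sketch (using made-up test rounds rather than the ThousandEyes API) that computes a 24-hour mean and a 3-hour standard deviation from a series of timestamped results:

```python
# Minimal sketch of the two look-back windows (illustrative only).
from datetime import datetime, timedelta, timezone
import statistics

now = datetime.now(timezone.utc)

# Hypothetical test rounds, one every 10 minutes: (timestamp, response time in ms).
rounds = [(now - timedelta(minutes=10 * i), 500 + (i % 7) * 12) for i in range(144)]

def last_hours(results, hours):
    """Return the values recorded within the last `hours` hours."""
    cutoff = now - timedelta(hours=hours)
    return [value for ts, value in results if ts >= cutoff]

mean_24h = statistics.mean(last_hours(rounds, 24))   # baseline average: 24-hour window
stdev_3h = statistics.stdev(last_hours(rounds, 3))   # standard deviation: 3-hour window

print(f"24-hour mean: {mean_24h:.1f} ms, 3-hour standard deviation: {stdev_3h:.1f} ms")
```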

Example Scenario

Let's imagine a scenario where an HTTP server test runs every ten minutes. Over the course of 24 hours, an agent in New York runs the test 144 times, and the response times it gathers average 500ms.

Attached to this test is an alert rule that uses a dynamic baseline. Based on the results so far, whether an alert will fire or not for the next test round depends on whether it was configured using standard deviation, percentage change, or an absolute value:

  • Standard deviation - Say that the standard deviation for the last 3 hours' results is 36ms. Using the default multiplier of 2, the alert will fire if the next test round returns a response time greater than 500+(36x2) = 572ms.

  • Percentage change - Using the default percentage change of 20%, the alert will fire if the next test round returns a response time greater than 500+20% = 600ms.

  • Absolute value - Using the default absolute value of 50ms, the alert will fire if the next test round returns a response time greater than 500+50 = 550ms.

The different options allow you to adapt your alerting framework to better reflect the fluctuation in test results, and ensure that your system isn’t overwhelmed with alerts because of static metric baselines.
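
For reference, the three thresholds from this scenario can be reproduced with a few lines of arithmetic (a sketch using the example numbers above, not values returned by the platform):

```python
# Worked calculation for the scenario above (illustrative only).
baseline_avg = 500          # 24-hour average response time, in ms
stdev_3h = 36               # standard deviation over the last 3 hours, in ms

std_dev_threshold = baseline_avg + 2 * stdev_3h        # 500 + (36 x 2) = 572 ms
pct_change_threshold = baseline_avg * 1.20             # 500 + 20%      = 600 ms
absolute_threshold = baseline_avg + 50                 # 500 + 50       = 550 ms

next_round = 580  # hypothetical response time for the next test round, in ms
print(next_round > std_dev_threshold)     # True  -> standard-deviation rule fires
print(next_round > pct_change_threshold)  # False -> percentage-change rule does not
print(next_round > absolute_threshold)    # True  -> absolute-value rule fires
```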

Metrics for Dynamic Baselines

Dynamic baselines are currently supported for the following metrics:

Cloud and Enterprise Agents

Alert Type | Metric
DNS - DNS server | Resolution time
Network - Agent-to-agent | Jitter
Network - Agent-to-agent | Latency
Network - Agent-to-server | Jitter
Network - Agent-to-server | Latency
Network - Agent-to-server | Proxy jitter
Network - Agent-to-server | Proxy latency
Voice - RTP stream | Latency
Voice - RTP stream | Packet delay variation
Voice - SIP server | Connect time
Voice - SIP server | DNS time
Voice - SIP server | Invite time
Voice - SIP server | Options time
Voice - SIP server | Register time
Voice - SIP server | Response time
Voice - SIP server | Total time
Voice - SIP server | Wait time
Web - FTP server | Connect time
Web - FTP server | DNS time
Web - FTP server | FTP negotiation time
Web - FTP server | Response time
Web - FTP server | SSL negotiation time
Web - FTP server | Total time
Web - FTP server | Transfer time
Web - FTP server | Wait time
Web - HTTP server | Connect time
Web - HTTP server | DNS time
Web - HTTP server | Receive time
Web - HTTP server | Response time
Web - HTTP server | SSL negotiation time
Web - HTTP server | Total time
Web - HTTP server | Wait time
Web - Page load | DOM load time
Web - Page load | Page load time
Web - Page load | Response time
Web - Transaction | Transaction time

Endpoint Agents

Alert Type | Metric
Network - Endpoint End-to-End (Server) | Jitter
Network - Endpoint End-to-End (Server) | Latency
Web - Endpoint HTTP Server | Connect Time
Web - Endpoint HTTP Server | DNS Time
Web - Endpoint HTTP Server | Receive Time
Web - Endpoint HTTP Server | Response Time
Web - Endpoint HTTP Server | SSL Negotiation Time
Web - Endpoint HTTP Server | Throughput
Web - Endpoint HTTP Server | Total Time
Web - Endpoint HTTP Server | Wait Time

Creating Alert Rules with Dynamic Baselines

The image below shows an example alert rule that uses a dynamic baseline. The alert rule condition states that the alert fires if the response time exceeds the last 24 hours' average by more than 2 times the last 3 hours' standard deviation.
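
Expressed as a predicate, this condition amounts to the following check (a sketch of the logic only; the parameter names are illustrative, not the platform's rule syntax):

```python
def dynamic_baseline_condition(response_time_ms, mean_24h_ms, stdev_3h_ms, multiplier=2):
    """Fire when the response time exceeds the 24-hour average by more than
    `multiplier` times the 3-hour standard deviation."""
    return response_time_ms > mean_24h_ms + multiplier * stdev_3h_ms

# With the numbers from the earlier scenario (500 ms average, 36 ms standard deviation):
print(dynamic_baseline_condition(580, mean_24h_ms=500, stdev_3h_ms=36))  # True
print(dynamic_baseline_condition(560, mean_24h_ms=500, stdev_3h_ms=36))  # False
```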

Inspecting Alerts Triggered by a Dynamic Baseline

When an alert is triggered based on a dynamic-baseline alert rule, you can inspect the dynamic baseline used. In the ThousandEyes platform, go to Alerts > Alert List. On the Active Alerts or Alert History tab, hover over the tooltip to see information about the alert rule that generated this alert.

Mitigating Noisy Dynamic Baseline Alerts

Dynamic baseline alerts that are based on standard deviation can be very "noisy" for metrics with a small or very stable average. For example, the standard deviation of latency for a service could be less than 1ms. If your service jumped from 20ms to 20.4ms, this isn't inherently cause for concern, but if you have configured a sensitive dynamic baseline alert rule, alerts could fire regularly and increase noise.

When you configure an alert rule with a standard deviation-based dynamic baseline condition, we recommend adding a second alert condition that uses an absolute difference from the average. For example, you can add a condition such as latency > 5ms above the mean. This ensures that the alert rule only fires when the metric is both beyond the standard-deviation threshold and a certain absolute amount above your average.
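
As a sketch of how the combined conditions behave (illustrative names and numbers only, not the platform's rule syntax), the rule fires only when the metric clears both thresholds:

```python
# Sketch of the recommended combination: fire only when the metric is both
# beyond the standard-deviation baseline AND more than an absolute margin
# above the 24-hour average (illustrative only).
def should_fire(latency_ms, mean_24h_ms, stdev_3h_ms, multiplier=2, min_abs_diff_ms=5):
    above_stdev = latency_ms > mean_24h_ms + multiplier * stdev_3h_ms
    above_abs = latency_ms > mean_24h_ms + min_abs_diff_ms
    return above_stdev and above_abs

# A very stable service: 20 ms average latency, 0.1 ms standard deviation.
print(should_fire(20.4, mean_24h_ms=20, stdev_3h_ms=0.1))  # False: only 0.4 ms above the average
print(should_fire(27.0, mean_24h_ms=20, stdev_3h_ms=0.1))  # True: beyond both thresholds
```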