
Cost anomalies

Overview

DoiT Cloud cost anomaly detection offers end-to-end monitoring of spikes in your Google Cloud and Amazon Web Services costs across all your projects and services.

The detection service uses machine learning algorithms to monitor billing data and analyze spending trends in your cloud environment. It identifies billing patterns across DoiT customers, forecasts your cloud spending, and continually improves itself to provide more accurate results.

Billing records that don't align with your anticipated spending behavior are identified as potential anomalies. You can also get insights into which resources are causing the anomalies and take corrective actions if necessary.

Caution

The data analysis begins as soon as you sign up. However, for anomaly detection to work properly, we need at least seven full days of reference data in a specific project.

If anomaly detection is critical to your operation, we recommend waiting out this seven-day period before making significant changes to your cloud spending.

Note

Required Permissions: Attributions Manager, Anomalies Viewer, Cloud Analytics

Access anomalies

The DoiT Platform stores all the detected cost anomalies.

To access cost anomalies, select Governance from the top navigation menu, and then select Cost anomalies.

The cost anomalies page.

  • Start Time: The start time of the hourly usage window in which the aggregated cost exceeds the predefined threshold and is considered a potential anomaly. The time value comes from the billing data supplied by the cloud provider: for AWS, it is lineItem/UsageStartDate; for Google Cloud, it is usage_start_time.

  • Service: See Standard dimensions: Services. Note that the anomaly detection system evaluates anomalies per service per project/account across regions; it doesn't evaluate multiple services in a project/account as a whole.

  • Project/Account: See Standard dimensions: Project/Account ID.

  • Billing Account ID: For Google Cloud it refers to the Cloud Billing account ID that the usage is associated with. For AWS it is either your DoiT customer ID (if you're on a Dedicated payer account) or your CloudHealth account ID (if you're on a Consolidated billing account).

  • Platform: The cloud provider, either Amazon Web Services or Google Cloud.

  • Time Frame: Whether the anomaly is triggered based on the Hourly or Daily time series of the usage and cost data.

  • Attribution: The group of resources being monitored. By default, the anomaly detection service monitors two preset attributions: All AWS Resources and All GCP Resources. You can also Configure custom anomalies scopes to monitor a specific group of resources.

  • Anomaly: A thumbnail image of the anomaly chart.

  • Severity: The severity level of the anomaly. There are three severity levels: Information, Warning, and Critical. DoiT assigns the level according to how far the actual cost deviates from the established pattern.

  • Cost of anomaly: The difference between the actual cost and the maximum cost in the normal range.

  • Details: You can select the View button in this column to view the details of the specific anomaly.
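
As a rough illustration of the Cost of anomaly column, the sketch below subtracts the top of the normal range from the actual cost. The function name and the clamp to zero are assumptions made for this example; they are not taken from the DoiT Platform.

```python
def cost_of_anomaly(actual_cost: float, normal_range_max: float) -> float:
    """Return how far the actual cost exceeds the top of the normal range.

    Assumption: a cost inside the normal range yields 0.0 rather than a
    negative number. The source only defines the value for anomalies.
    """
    return max(0.0, actual_cost - normal_range_max)


# Hypothetical figures: actual cost of US$140 against a normal-range maximum of US$100.
print(cost_of_anomaly(140.0, 100.0))  # 40.0
```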

Configure custom anomalies scopes

To monitor a specific subset of costs in your accounts, you can use attributions to define a custom scope.

For example, to monitor only your production infrastructure, you can create an attribution that defines your environment and then enable monitoring for cost anomalies on it.

FAQ

What is the latency of cost anomaly detection?

In most cases, an anomaly is reported within 12 hours after the aggregated cost exceeds the predefined threshold.

The anomaly detection engine checks usage and cost data hourly. The latency mainly relates to the varying intervals at which cloud providers report usage and cost data.

See also AWS cost data latency in DoiT Console and Google Cloud's frequency of data loads.

Why was a spike in my costs not reported as an anomaly?

The anomaly detection system evaluates costs per service; it doesn't evaluate the combined costs of multiple services.

If a spike in your cloud costs was not detected as an anomaly, first check whether the spike was caused by more than one service.
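
To check whether a spike is spread across several services, it can help to total costs per service per project/account, which mirrors the granularity described above. The following is a minimal sketch; the record layout (project, service, cost) and the function name are hypothetical and do not reflect a DoiT export format.

```python
from collections import defaultdict


def cost_per_service(records):
    """Sum costs for each (project, service) pair.

    `records` is an iterable of (project, service, cost_usd) tuples.
    This layout is an assumption made for illustration only.
    """
    totals = defaultdict(float)
    for project, service, cost_usd in records:
        totals[(project, service)] += cost_usd
    return dict(totals)


# Hypothetical billing rows for one day.
records = [
    ("proj-a", "Compute Engine", 12.0),
    ("proj-a", "Compute Engine", 8.0),
    ("proj-a", "Cloud Storage", 5.0),
    ("proj-b", "Compute Engine", 7.0),
]

for key, total in sorted(cost_per_service(records).items()):
    print(key, total)
```

Each total is judged on its own, so a large combined increase can consist of several per-service increases that are individually too small to be flagged.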

In addition, the spend of a service must meet all the following criteria to qualify as an anomaly:

  1. The hourly spend of the service at the usage time considered is at least US$5.

  2. The daily spend of the service is at least US$90.

  3. The daily spend exceeds monthly seasonality.
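
The three criteria above can be sketched as a single check. This is an illustration under stated assumptions, not DoiT's implementation: the class, field, and function names are hypothetical, and the third criterion is modeled as a simple comparison against an expected daily spend supplied by the caller, because the source does not say how monthly seasonality is computed.

```python
from dataclasses import dataclass

MIN_HOURLY_SPEND_USD = 5.0   # criterion 1, from the list above
MIN_DAILY_SPEND_USD = 90.0   # criterion 2, from the list above


@dataclass
class ServiceSpend:
    hourly_usd: float          # spend in the usage hour being considered
    daily_usd: float           # spend for the whole day
    seasonal_daily_usd: float  # assumption: expected daily spend given monthly seasonality


def meets_anomaly_criteria(spend: ServiceSpend) -> bool:
    """Return True only if all three criteria hold for one service."""
    return (
        spend.hourly_usd >= MIN_HOURLY_SPEND_USD
        and spend.daily_usd >= MIN_DAILY_SPEND_USD
        and spend.daily_usd > spend.seasonal_daily_usd
    )


# Hypothetical values: the second case fails because daily spend is below US$90.
print(meets_anomaly_criteria(ServiceSpend(6.0, 120.0, 100.0)))  # True
print(meets_anomaly_criteria(ServiceSpend(6.0, 80.0, 50.0)))    # False
```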