Billing data schema
This page provides the schema definition for the billing data available in DoiT Cloud Intelligence. The schema defines the structure of the billing data tables available for the CloudFlow SQL node.
You can query your data using the provider-specific table names.
Cloud providers
You can query billing data from the following cloud providers, depending on the cloud environment you are using.
AWS
The following table defines the data for the aws_dci table.
| Name | Type | Description | Sample Values |
|---|---|---|---|
| service_description | STRING | The human-readable name of the AWS service being billed. | Amazon Elastic Compute Cloud |
| service_id | STRING | The technical identifier (code) for the AWS service. | AmazonEC2 |
| sku_description | STRING | The detailed, human-readable description of the priced SKU or usage type. | EU (Ireland) On Demand Linux m5.large Instance Hour |
| sku_id | STRING | The AWS SKU or rate code used to price this line item. | ABCDE12345FGHI67890 |
| operation | STRING | The specific AWS API or usage operation that generated the cost. | RunInstances:0002 |
| description | STRING | A short text description of this charge line, often detailing the usage context. | Compute instance usage |
| usage_start_time | TIMESTAMP | The start time of the metered usage window for this row. | 2025-12-01 10:00:00.000000 UTC |
| usage_end_time | TIMESTAMP | The end time of the metered usage window for this row. | 2025-12-01 11:00:00.000000 UTC |
| project_id | STRING | The AWS Account ID or unique identifier for the account. | 123456789012 |
| project_name | STRING | The display name associated with the AWS account. | Production-Web-Account |
| project_number | STRING | The numerical identifier for the AWS account (same as project_id). | 123456789012 |
| project_labels | RECORD | Customer-defined tags attached at the account or cost allocation level. | {"Owner": "Finance-Team"} |
| labels | RECORD | Customer-defined tags on the specific resource or usage line item. | {"Environment": "Prod"} |
| system_labels | RECORD | AWS-generated metadata or tags (e.g., resource type, tenancy). | {"aws:createdBy": "Terraform"} |
| region | STRING | The AWS region where the usage ran or is billed. | eu-west-1 |
| zone | STRING | The Availability Zone where the resource was located (when available). | eu-west-1a |
| cost_type | STRING | Type of cost (e.g., Usage, Recurring Fee, Tax, Credit, Refund). | Usage |
| resource_id | STRING | The AWS specific ID of the underlying resource (e.g., EC2 Instance ARN). | arn:aws:ec2:eu-west-1:123... |
| usage.amount_in_pricing_units | FLOAT | Quantity used for pricing (e.g., hours, GB, requests) in provider units. | 1.0 |
| usage | FLOAT | The usage quantity aggregated for reporting purposes. | 1.0 |
| cost | FLOAT | The net cost for this row in the billing currency after pricing and discounts. | 0.096 |
| credit | STRING | The credit amount aggregated for reporting purposes (often represented as a negative cost). | -0.015 |
Azure
The following table defines the data for the azure_dci table.
| Name | Type | Description | Sample Values |
|---|---|---|---|
| service_description | STRING | The human-readable name of the Azure service being billed. | Virtual Machines |
| service_id | STRING | The technical identifier for the Azure service. | microsoft.compute |
| sku_description | STRING | The detailed, human-readable description of the priced SKU or usage type. | Standard, D2, v3, VM |
| sku_id | STRING | The Azure SKU or rate code used to price this line item. | DZN-0036 |
| operation | STRING | The specific API or usage operation that generated the cost. | VirtualMachines/allocate |
| description | STRING | A short text description of this usage or charge line. | Windows, D2,v3, Instance, Hour |
| usage_start_time | TIMESTAMP | Start time of the metered usage window for this row. | 2025-12-01 10:00:00.000000 UTC |
| usage_end_time | TIMESTAMP | End time of the metered usage window for this row. | 2025-12-01 11:00:00.000000 UTC |
| project_id | STRING | The Azure Subscription ID where the usage occurred. | 12345678-abcd-efgh-ijkl-123456789012 |
| project_name | STRING | The display name associated with the Azure Subscription or Resource Group. | Production-Web-Subscription |
| project_number | STRING | The numerical identifier for the subscription or project. | 987654321012 |
| project_labels | RECORD | Customer-defined tags applied at the subscription or resource group level. | {"Owner": "DevOps"} |
| labels | RECORD | Customer-defined tags on the specific resource or usage line item. | {"CostCenter": "CC1001"} |
| system_labels | RECORD | Azure-generated metadata or tags (e.g., resource type). | {"resourceType": "Microsoft.Compute/virtualMachines"} |
| region | STRING | The Azure region where the usage ran or is billed. | eastus |
| zone | STRING | The Availability Zone where the resource was located (when available). | 1 |
| cost_type | STRING | Type of cost (e.g., Usage, Fee, Tax, Discount). | Usage |
| resource_id | STRING | The Azure specific ID of the underlying resource (e.g., VM Resource ID). | /subscriptions/123/resourceGroups/rg-prod/... |
| usage.amount_in_pricing_units | FLOAT | Quantity used for pricing (e.g., hours, GB, requests) in provider units. | 1.0 |
| report.usage | FLOAT | The usage quantity aggregated for reporting purposes. | 1.0 |
| report.cost | FLOAT | The net cost for this row in the billing currency after pricing and discounts. | 0.0450 |
| report.credit | STRING | The credit amount aggregated for reporting purposes (often represented as a negative cost). | -0.0050 |
Google Cloud
The following table defines the data for the gcp_dci table.
| Name | Type | Description | Sample Values |
|---|---|---|---|
| billing_account_id | STRING | The unique ID of the Google Cloud billing account responsible for the charges. | 01A2B3-4C5D6E-7F8G9H |
| service_description | STRING | The human-readable name of the cloud service consumed. | Compute Engine |
| service_id | STRING | The technical identifier for the Google Cloud service. | 6F81-5844-456A |
| sku_description | STRING | The detailed, human-readable description of the billable unit. | N1 Predefined Instance Core running in Americas |
| sku_id | STRING | The unique identifier for the specific billable product or usage unit. | B53E-74B8-60C9 |
| usage_start_time | TIMESTAMP | The start of the granular time interval for which the usage was recorded. | 2025-12-01 10:00:00.000000 UTC |
| usage_end_time | TIMESTAMP | The end of the granular time interval for which the usage was recorded. | 2025-12-01 11:00:00.000000 UTC |
| project_id | STRING | The unique ID of the Google Cloud project where the usage occurred. | my-production-project-12345 |
| project_name | STRING | The user-defined name of the project. | My Production Application |
| project_number | STRING | The project's unique numerical identifier. | 123456789012 |
| project_labels | RECORD | Custom tags applied directly to the project. | {"environment": "prod"} |
| labels | RECORD | Custom user-defined labels/tags applied to the resource for cost allocation. | {"team": "frontend"} |
| system_labels | RECORD | Labels automatically applied by Google Cloud for resource categorization. | {"resource_type": "instance"} |
| country | STRING | The country where the resource is hosted. | US |
| region | STRING | The specific regional location. | us-central1 |
| zone | STRING | The specific zone within the region. | us-central1-a |
| cost_type | STRING | The nature of the financial transaction. | Usage |
| resource_id | STRING | The specific identifier of the resource itself (e.g., a VM instance ID). | 0123456789012345678 |
| resource_global_id | STRING | A unique, globally scoped identifier for the resource. | gcp-us-central1-12345 |
| usage.amount_in_pricing_units | FLOAT | The amount of usage normalized to the price unit. | 3600.0 |
| kubernetes_cluster_name | STRING | The name of the GKE or Anthos cluster associated with the usage. | production-cluster-a |
| kubernetes_namespace | STRING | The specific Kubernetes namespace where the usage occurred. | default |
| seller_name | STRING | Identifies the seller, usually 'Google' or a third-party marketplace vendor. | Google |
| subscription.instance_id | STRING | The ID of the specific subscription instance that covers this usage. | sub-a1b2c3d4 |
| transaction_type | STRING | The specific type of billing transaction. | USAGE |
| usage | FLOAT | The usage quantity aggregated for reporting purposes. | 1.0 |
| cost | FLOAT | The net cost for this row in the billing currency after pricing and discounts. | 0.0245 |
| credits | STRING | The credit amount aggregated for reporting purposes (often represented as a negative cost). | -0.0055 |
Third-party platforms
You can query billing data from the following third-party platforms, depending on the third-party platforms you have connected to DoiT Cloud Intelligence.
Databricks, Datadog, MongoDB, Snowflake
The following table defines the data for the databricks_dci, datadog_dci, mongodb_dci and snowflake_dci tables.
| Column Name | Column Type | Description | Sample Values |
|---|---|---|---|
| etl | RECORD | A nested field containing processing metadata from the ETL (Extract, Transform, Load) pipeline. | N/A (See nested fields below) |
| billing_account_id | STRING | The unique financial account ID (e.g., Databricks workspace billing, Snowflake account). | DB-12345-ABCDE |
| project_id | STRING | The core organizational identifier (e.g., Databricks workspace ID, MongoDB Project ID, Snowflake Account Name). | prod_databricks_us-west |
| service_description | STRING | The human-readable name of the specific service consumed. | Databricks SQL Warehouse Compute |
| service_id | STRING | The unique technical identifier for the platform service. | databricks_sql_compute |
| sku_description | STRING | The detailed, human-readable description of the billable unit. | Databricks DBU Standard (SQL) |
| sku_id | STRING | The unique identifier for the specific billable product or usage unit. | DBU_STANDARD_SQL |
| usage_date_time | DATETIME | The primary date and time when the resource consumption took place. | 2025-12-01 10:00:00 |
| usage_start_time | TIMESTAMP | The start of the granular time interval for which the usage was recorded. | 2025-12-01 10:00:00.000000 UTC |
| usage_end_time | TIMESTAMP | The end of the granular time interval for which the usage was recorded. | 2025-12-01 11:00:00.000000 UTC |
| project | RECORD | A nested field containing metadata about the organizational container where the usage occurred. | N/A (See nested fields below) |
| labels | REPEATED RECORD | Custom user-defined tags/labels applied to the resource or usage event. | N/A (See nested fields below) |
| system_labels | REPEATED RECORD | Labels automatically applied by the platform for categorization. | N/A (See nested fields below) |
| location | RECORD | A nested field containing geographical details about where the service was executed or data stored. | N/A (See nested fields below) |
| export_time | TIMESTAMP | The time this specific billing record was exported or finalized by the platform. | 2025-12-02 05:30:00.000000 UTC |
| cost | FLOAT | The total cost of the usage line item before any applicable taxes or adjustments. | 1.45 |
| currency | STRING | The currency in which the cost is reported. | USD |
| currency_conversion_rate | FLOAT | The rate used to convert the billing currency to the reporting currency. | 1.0 |
| usage | RECORD | A nested field describing the quantity consumed and its units. | N/A (See nested fields below) |
| invoice | RECORD | A nested field containing billing period information. | N/A (See nested fields below) |
| cost_type | STRING | The nature of the financial transaction. | USAGE |
| report | REPEATED RECORD | A list of aggregated reporting metrics (cost, usage, savings) for this line item. | N/A (See nested fields below) |
| resource_id | STRING | The specific identifier of the resource generating the cost (e.g., Warehouse ID, MongoDB Cluster Name). | wh-prod-analytics-01 |
| operation | `STRING | The specific action or API call that generated the usage. | COMPUTE_QUERY_EXECUTION |
| row_id | STRING` | A unique identifier for this specific billing record. | 0a1b2c3d-4e5f-6789-0abc-def123456789 |
| description | STRING | A more detailed text description of the usage event. | Snowflake Virtual Warehouse X-Small consumption |
| customer_type | STRING | The categorization of the customer. | Enterprise |
| is_marketplace | BOOLEAN | A flag indicating whether the product/service was acquired via a marketplace. | false |
| customer_id | STRING | The unique identifier for the customer associated with this usage. | CUST-98765 |
| etl.ts | TIMESTAMP | The primary timestamp of when the record was processed by the ETL pipeline. | 2025-12-02 06:00:00.000000 UTC |
| etl.session_id | STRING | The unique string identifier linking this record to the ETL processing session. | session-flow-101 |
| etl.run_start_time | TIMESTAMP | The specific time the ETL job responsible for this record began execution. | 2025-12-02 05:00:00.000000 UTC |
| project.id | STRING | The project's unique short identifier (e.g., Workspace ID). | 123456789012 |
| project.name | STRING | The user-defined name of the project or account. | Data Analytics |
| project_labels | REPEATED RECORD | Custom tags applied directly to the project or account. | N/A (See nested fields below) |
| project.labels.key | STRING | The key of a custom project label. | Data_Tier |
| project.labels.value | STRING | The value of a custom project label. | Gold |
| project.ancestry_numbers | STRING | The numerical hierarchy path of the resource within the organization. | 123/456/789 |
Gen AI
The following table defines the data for the genai_dci table.
| Column Name | Column Type | Description | Sample Values |
|---|---|---|---|
| provider | STRING | The name of the cloud provider or platform where the Gen AI service was consumed. | GCP |
| billing_account_id | STRING | The unique ID of the financial account responsible for the Gen AI charges. | A1B2C3-4D5E6F-7G8H9I |
| project_id | STRING | The unique identifier for the cloud project or resource group where the usage occurred. | gen-ai-experiments-001 |
| project_internal_id | STRING | An internal, unique identifier for the project, separate from the user-facing project_id. | 123456789012 |
| project_number | STRING | The numerical identifier for the project or subscription. | 987654321012 |
| project_name | STRING | The human-readable name of the project. | AI Development Sandbox |
| project_ancestry_numbers | STRING | The numerical path of the project within the organization's hierarchy (e.g., Organization/Folder/Project). | 12345/67890/98765 |
| service_description | STRING | The human-readable name of the Gen AI service or API consumed. | Generative Language API |
| service_id | STRING | The technical identifier for the Gen AI service. | aiplatform.googleapis.com |
| sku_id | STRING | The unique identifier for the specific billable model or usage unit. | sku_inference_model_x |
| sku_description | STRING | The detailed, human-readable description of the billable unit (e.g., tokens, model version). | Code Generation - 1M tokens |
| usage_date_time | DATETIME | The primary date and time when the resource consumption took place. | 2025-12-01 10:30:00 |
| usage_start_time | TIMESTAMP | The start of the time interval for which the usage was recorded. | 2025-12-01 10:00:00.000000 UTC |
| usage_end_time | TIMESTAMP | The end of the time interval for which the usage was recorded. | 2025-12-01 11:00:00.000000 UTC |
| export_time | TIMESTAMP | The time this specific usage record was exported by the cloud provider. | 2025-12-02 05:00:00.000000 UTC |
| location | STRING | The high-level geographical location where the resource was hosted. | us-central |
| country | STRING | The country where the resource is hosted. | US |
| region | STRING | The specific regional location (e.g., us-central1). | us-central1 |
| zone | STRING | The specific availability zone within the region (if applicable). | us-central1-a |
| cost | FLOAT | The total cost of the usage line item before any applicable taxes or adjustments. | 0.0025 |
| currency | STRING | The currency in which the cost is reported. | USD |
| currency_conversion_rate | FLOAT | The rate used to convert the billing currency to the reporting currency. | 1.0 |
| cost_type | STRING | The nature of the financial transaction. | Usage |
| usage_amount | FLOAT | The total numeric amount of usage consumed (e.g., the number of tokens). | 125000.0 |
| usage_unit | STRING | The physical unit of measurement for the consumed amount. | token |
| usage_amount_in_pricing_units | FLOAT | The usage amount normalized to the pricing unit (e.g., dividing tokens by 1,000,000). | 0.125 |
| usage_pricing_unit | STRING | The unit used by the billing rate. | 1000000 token |
| invoice_month | STRING | The billing month the charge is associated with (e.g., YYYYMM). | 202512 |
| resource_id | STRING | The specific identifier of the resource associated with the usage (e.g., a specific endpoint ID). | model-endpoint-abc |
| customer_type | STRING | The categorization of the customer. | Direct |
| is_marketplace | BOOLEAN | A flag indicating whether the Gen AI service was acquired via the marketplace. | false |
| labels | REPEATED RECORD | Custom user-defined tags applied to the usage for cost allocation. | N/A (See nested fields below) |
| system_labels | REPEATED RECORD | Labels automatically applied by the cloud provider for resource categorization. | N/A (See nested fields below) |
| labels.key | STRING | The key of a custom label. | environment |
| labels.value | STRING | The value of a custom label. | staging |
| system_labels.key | STRING | The key of a system-applied label. | ai.google.com/model_version |
| system_labels.value | STRING | The value of a system-applied label. | gemini-pro-v2 |