BigQuery Lens
Introduction
Google Cloud BigQuery is a serverless, multi-cloud data analytics platform. Its pricing mainly consists of two parts: compute pricing for the cost to process queries, and storage pricing for the cost to store data.
DoiT BigQuery Lens analyzes your BigQuery usage and generates actionable recommendations for both compute and storage, helping you maximize the efficiency of your cloud spend, improve query performance, and ensure sustainable growth as your data scales.
Dataset and audit log sink
BigQuery Lens creates the following in your billing project:
-
A dataset named
doitintl-cmp-bq
that is created when you updated your service account with the required permissions and removed when you removed the service account. -
An audit log sink named
<Billing_Project_ID>.doitintl-cmp-bq.cloudaudit_googleapis_com_data_access
for all your BigQuery jobs.
Queries run by BigQuery Lens
BigQuery Lens runs the following processes:
-
Table discovery process: Collects table metadata. This process runs on a schedule. It writes the table metadata it collected into a table named
tables_discovery
in thedoitintl-cmp-bq
dataset. -
Dashboard update process: Updates the BigQuery Lens dashboard.
- Daily run of cost simulation queries to provide the cost savings information and potential savings.
- Daily run of lens calculation queries to provide detailed BigQuery usage.
-
Reservation mapping process: If you have workloads on the capacity pricing model, this process maps the edition and capacity commitments in use when any job completed. It runs once per hour and writes the data it collected into the following tables in the
doitintl-cmp-bq
dataset.- A
reservations_mapping_history
table that contains a list of reservations with their associated editions and the projects they're allocated to. - A
capacity_commitments_history
table that contains information about the capacity commitments that were active on that day.
- A
BigQuery Lens cost
BigQuery Lens creates and maintains the doitintl-cmp-bq
dataset on your behalf and runs queries in your billing project to provide you with analytics data.
The actual cost of running the queries depends on the volume of your logs and the compute pricing model you choose. If the billing project is assigned to a reservation, BigQuery Lens jobs use slots from the assigned reservation.
If you have an excess of historical jobs, you may see an increased cost for BigQuery Streaming Inserts when we backfill them for the last 30 days during your service account setup.
Slots autoscaling
BigQuery compute capacity is measured in slots. When reserving BigQuery capacity, you can enable BigQuery editions and create a reservation or a capacity commitment associated with an edition.
If you use reservations with baseline and autoscaling slots, be aware of the following:
-
You're always charged for the baseline slots allocated to the reservation. Autoscaling slots are only added after all of the baseline slots (and idle slots if applicable) are consumed.
-
The number of slots always scales by multiples of 50, it may scale more than 50 slots within one step. You are charged for the number of scaled slots, regardless of whether they are used.
-
Autoscaled capacity is retained for at least 60 seconds (scale-down window).
When calculating BigQuery costs, we use the RESERVATIONS_TIMELINE view to track changes in reservations over time and run a reservation mapping process to find out the edition and commitments in use when a job completed.
When there are multiple jobs during an autoscaling event, we use a price based on a weighted average as jobs that have used baseline slots are not distinguished from those that have used autoscaled slots. As a result, while the overall totals are largely accurate, the cost of an individual job or reservation may not be.
Required permissions
-
To access BigQuery Lens, your DoiT account needs the Cloud Analytics permission.
-
BigQuery Lens needs specific permissions at the organization level. See BigQuery Lens permissions for details.
Set up BigQuery Lens
-
Connect your Google Cloud Organization to the DoiT Platform. Make sure to allow service account impersonation when configuring identities for DoiT workloads.
-
Update your service account to grant the permissions required by BigQuery Lens, BigQuery Lens Editions, and BigQuery Lens Insights.
NoteIn the current implementation, BigQuery Lens works with a single service account. If you grant permissions to multiple service accounts, BigQuery Lens works only for the first one.
Once set up, the BigQuery Lens will backfill historical data for the last 30 days and start gathering information about your Google Cloud BigQuery usage patterns. It can take up to 24 hours to fully populate the BigQuery Lens dashboard with statistics and recommendations.
BigQuery Lens Advanced
BigQuery Lens Advanced provides clustering recommendations based on the following assumptions:
- You'll continue to execute similar queries as you did in the previous 30 days, and
- You'll reference fields in the order that we recommend.
To set up BigQuery Lens Advanced, you also need to enable Google Cloud Resource Manager API, which manages metadata for Google Cloud Platform resource containers:
-
Sign in to the Google Cloud console, go to the Cloud Resource Manager API page.
-
Select the project where you created your service account.
-
Select Enable.
BigQuery Lens dashboard
The BigQuery Lens dashboard consists of widgets that highlight different aspects of your BigQuery usage.
General settings:
-
Time range: Adjusts the time frame of the data you're looking at. Options: Last day, Last 7 days, and Last 30 days.
-
Usage type: Filters queries by pricing models. Options:
-
On-Demand: Queries that are billed using the on-demand compute pricing model.
-
Standard Edition, Enterprise Edition, Enterprise Plus Edition: Queries that are billed using the capacity compute pricing model of BigQuery editions.
-
Flat-Rate Legacy: The BigQuery flat-rate pricing model is no longer offered. This option applies only to customers who have existing flat-rate commitments and haven't upgraded to DoiT Cloud Intelligence yet.
-
Widgets
Except BigQuery Spend by SKU, all other widgets on the BigQuery Lens dashboard are updated daily at 01:00 AM UTC.
Recommendations
The Recommendations widget shows actionable insights into BigQuery optimizations that are relevant to your usage. You can select individual recommendations to see the details.
Compute recommendations
-
Backup and remove unused tables: This recommendation applies to all usage types. It lists the unused tables that you can back up and then remove from BigQuery to reduce storage costs.
Below is an example:
-
The
Project
column shows the billing project. See Project and Billing project. -
Choose a table in the
Table
column to open it in your Google Cloud BigQuery console. -
If the table has multiple partitions, select the number in the
Partition(s) to Remove
column to see which partitions should be removed. -
You can also download the full list as a CSV file.
-
On-demand pricing model recommendations
-
Limit query jobs: Reduce execution frequency of the listed jobs in the
Query ID
column by the percentage you choose on the slider and view theSavings by Reducing Jobs
. -
You also get recommendations on partitioning or clustering BigQuery tables via Insights if applicable.
Capacity pricing model recommendations
If you use BigQuery editions with 1-yr or 3-yr slot commitments, or if you have Flat-Rate Legacy usage, you will see the following recommendation when applicable:
- Move recurring queries to a different time slot: Lower your minimum average slots requirements by moving the recurring queries to a less-busy hour of the day. Review the Slots Consumption widget with the interval set to
Hour
to identify suitable hours. Troughs in your daily slots consumption indicate optimal hours to run recurring queries.
Storage recommendations
-
Change dataset to Physical Storage: Change a dataset's storage billing model to pay for the compressed physical storage used by tables in the dataset.
You can open the recommendation link to view the detailed information on datasets that could benefit from using the physical storage billing model, including the project, table, logical storage size, physical storage size, logical storage cost, physical storage cost, and potential savings.
-
The
Project
column shows the billing project. See Project and Billing project. -
Select a table name in the
Table
column to open it in your Google Cloud BigQuery console. -
You can also download the full list as a CSV file.
NoteReview Google's physical storage documentation for eligibility criteria. The BigQuery Lens recommendation doesn't perform eligibility check.
-
Explorer
Use the Explore widget to get granular information about your BigQuery usage. You can select individual entries to drill down into the data.
The metrics available vary with the scope and usage type.
Usage type | Scope | Metric |
---|---|---|
On-Demand, Edition (pay as you go) | Billing Project, User | Scan Price, Scan TB |
On-Demand, Edition (pay as you go) | Project, Dataset, Table | Scan Price, Scan TB, Storage Price, Storage TB |
Edition (one year and three year commitments), Flat-Rate Legacy | Billing Project, User | Average Slots |
Edition (one year and three year commitments), Flat-Rate Legacy | Project, Dataset, Table | Storage Price, Storage TB |
The example below shows the scan price of a chosen user. You can also select the individual queries run by the user to view the price.
Project and Billing project
-
The Project contains the data that is queried during the job. It is called data project in the BigQuery audit logs overview.
-
The Billing project is the project that runs the query. It is the project where you create your service account.
The example below shows where you can find the billing project in the Google Cloud BigQuery console.
"Not yet allocated" items
BigQuery Lens may not be able to detect which table is being scanned due to the following reasons:
- Queries like
SELECT 1
- Queries we are unable to parse due to complexity (i.e., a parser time out)
- Queries that use a function or statement that we do not support yet
In such cases, you'll see Not yet allocated
items in the BigQuery Lens Explorer widget.
Slots consumption
Shows the number of slots consumed by your organization in terms of Peak usage and Average usage per Hour or Day.
The example below shows the slots assumption when the usage type is one of BigQuery editions.
Scans by table type
This widget shows the volume of data you are scanning from various types of tables, broken down between unpartitioned tables and multiple types of partitioned tables, along with external sources (e.g., Google Sheets).
You can select a table type to see the most frequently scanned tables of that type.
BigQuery spend by SKU
This widget shows a breakdown of BigQuery costs per SKU.
You can select the three dots menu (⋮) in the upper-right corner of the widget for two additional options:
-
Refresh: By default, the BigQuery Spend by SKU widget is updated twice per day. Use the Refresh option to fetch the latest billing data from Google.
NoteThe information shown in the widget is when the DoiT Platform retrieves the billing data, while the freshness of the data is decided by Google Cloud's frequency of data loads.
-
Open full report: Opens the report that generates the visualization in this widget.
What's next
-
Read our blog series on BigQuery Optimizations.
-
Check out our on-demand events.