The DoiT BigQuery Lens (BQ Lens) helps you gain insights and highlight inefficiencies in your organization's Google Cloud BigQuery usage.
Set up BigQuery Lens
To access the BigQuery Lens, make sure that you have followed the steps at Connect your Google Cloud Organization to upload your Google Cloud service account and grant the appropriate permissions to the DoiT Platform.
To check the status of your service account, select the Settings icon (a gear) from the top menu bar, then select Google Cloud.
Select the BigQuery Lens feature on the screen to enable it. The same is true for BigQuery Lens Advanced.
Once the Google Cloud service account is set up, select the Attach button from the dashboard navigation bar, and then select BigQuery Lens from the list of options.
After attaching the BigQuery Lens dashboard, a BQ Lens icon will appear on your dashboard, and the DoiT Platform will start gathering information about your account usage patterns. You will see messages on the screen informing you of the progress.
The BigQuery historical jobs are backfilled for the last 30 days.
Once activated, the DoiT Platform will populate the BigQuery Lens dashboard with the statistics and recommendations.
The BigQuery Lens dashboard consists of several widgets that highlight different aspects of your BigQuery usage.
Before diving into the widgets, have a look at the options provided in the two dropdown lists above the widgets area:
- Time range: Adjusts the time frame of the data you're looking at. Options: Last 30 days, Last 7 days, and Last day.
- Usage type: Determines which BigQuery projects you're looking at, based on the pricing model you're using to pay for them. Options: On-Demand, Flat-Rate.
Except BigQuery Spend by SKU, all other widgets on the BigQuery Lens dashboard are updated daily at 1:00 AM UTC.
Comprehensive recommendations with details about how to act on each suggestion.
Get more granular information as far as your team's BigQuery usage, adjustable by the user in the following format:
- On-Demand usage type: (Project, Billing Project, Dataset, Table, User) by (Scan Price, Scan TB)
- Flat-Rate usage type: (Project, Billing Project, Dataset, Table, User) by (Average Slots)
Select any of the components of the Explorer to drill down into the data.
Slots Consumption widget
The number of slots consumed by your organization in terms of Peak usage and Average usage by Day or Hour.
Scans by Table Type widget
See how much data you are scanning from various types of tables, broken down between unpartitioned tables and multiple types of partitioned tables, along with external sources (e.g., Google Sheets).
Select a table type to see the most frequently scanned tables of that type.
BigQuery Spend by SKU widget
BigQuery costs broken down per SKU.
You can select the three dots menu (⋮) in the upper-right corner of the widget for two additional options;
Refresh: By default, the BigQuery Spend by SKU widget is updated twice per day. You can use the Refresh option to fetch the latest billing data from Google.Note
The information shown here is when the DoiT Platform retrieves the billing data, while the freshness of the data is decided by Google Cloud's frequency of data loads.
Open full report: Opens the report that generates the visualization in this widget.
Below are the recommendation categories you'll see in the Recommendations widget, which vary with the Usage type you selected.
- Backup and remove unused tables: Back up and remove the tables listed under the Table column. If the table has multiple partitions, select the number under Partition(s) to Remove to see which partitions should be removed.
- Cluster your tables: Cluster the tables listed under the Table column by the fields suggested under the Cluster By column.
- Enforce Partition Fields: Use the suggested partitioned fields under the Partition Field column for the corresponding queries listed under the Query ID column.
- Partition your tables: Partition the tables listed under the Table column by the suggested fields listed under the Partition Fields column.
- Limit query jobs: Reduce execution frequency of the listed jobs under the Query ID column by the percentage you choose on the slider and view the associated savings of each reduced job under the Savings by Reducing Jobs column.
- Switch to monthly flat-rate plan: Purchase the slots amount listed in the Optimal slots amount column.
- Move repeating queries to a different time slot: Lower your minimum average slots requirements by moving the recurring queries to a less-busy hour of the day. Review the Slots Consumption widget with the interval set to
Hourto identify suitable hours. Troughs in your daily slots consumption indicate optimal hours to run recurring queries.
- Backup and remove unused tables: Back up and remove the unused tables to lower costs.
Frequently Asked Questions (FAQ)
What permissions does the BigQuery Lens need?
The Standard BigQuery Lens needs the following write permissions:
In addition to the above, BigQuery Lens Advanced needs one more write permission:
Why does BigQuery Lens Advanced need an extra permission?
bigquery.tables.getData permission provides clustering recommendations that are unique to BigQuery Lens Advanced.
It allows us to query your BigQuery tables, find your top 20 non-clustered tables, and then determine which fields are the best candidates to cluster those tables by and in what order.
More specifically, this permission helps us identify the cardinality of the columns referenced in the
WHERE clause (i.e., the columns which can be clustered on). Knowing how many distinct elements exist in each column allows us to compute the average chunk size and determine the best savings candidates.
BigQuery Lens Advanced provides recommendations based on the assumption that:
- You will continue to execute similar queries as you did in the previous 30 days, and
- You will reference fields in the order that we recommend
What datasets do you create?
A dataset named
doitintl-cmp-bq is created in the billing project attached to the service account you added to the DoiT Platform.
What sinks do you create?
A sink for query jobs is created in the billing project attached to the service account you added to the DoiT Platform. This sink pushes all your BigQuery jobs into a table under
What queries do you run?
BigQuery Lens has two main processes: a table discovery process to collect table metadata, and a dashboard update process to run queries.
The table discovery process runs on a schedule and writes the table metadata it collected into a table named
tables in the
The dashboard update process executes the following queries:
- Daily run of cost simulation queries to provide the cost savings information and potential savings.
- Daily run of lens calculation queries to provide detailed BigQuery usage.
What does "Not yet allocated" in the BigQuery Explorer mean?
You will see "Not yet allocated" when analyzing dimensions like Projects in the BigQuery Explorer when we cannot detect which table a query is scanning.
We may not be able to detect which table is being scanned because of:
- Queries like
- Queries we are unable to parse due to complexity (i.e., a parser time out), or
- Queries that use a function or statement that we do not support yet
How much does BigQuery Lens cost?
BigQuery Lens creates and maintains the
doitintl-cmp-bq dataset on your behalf and runs queries to provide you with analytics data. The cost for running the queries is usually around $10 per month.
It is worth mentioning that the actual cost depends on the volume of your logs. If your organization uses BigQuery heavily, the cost could be much higher if you are using BigQuery's on-demand analysis pricing model.
If you have an excess of historical jobs, you may see an increased cost for BigQuery Streaming Inserts when we backfill them for the last 30 days during your service account setup.
What's the difference between Project and Billing project in the Explorer widget?
The Billing project runs the query and incurs the costs.
The Project contains the data that is queried.