BigQuery Lens
DoiT BigQuery Lens (BQ Lens) helps you gain insights of your Google Cloud BigQuery usage and optimize your query performance.
- Cloud Analytics
Before you begin
BigQuery Lens needs permissions at the organization level. See DoiT Security and data access policy: BigQuery Lens.
BigQuery Lens will set up an audit log sink for your BigQuery jobs. See What datasets and sinks do you create.
The BigQuery historical jobs are backfilled for the last 30 days.
Set up BigQuery Lens
To set up BigQuery Lens:
Connect your Google Cloud Organization to the DoiT Platform, if you haven't done so.
Update your service account to grant the permissions required by BigQuery Lens and optionally, BigQuery Lens Advanced.
Once the BigQuery Lens is set up, the DoiT Platform will start gathering information about your Google Cloud BigQuery usage patterns and populating the dashboard with statistics and recommendations.
BigQuery Lens dashboard
The BigQuery Lens dashboard consists of several widgets that highlight different aspects of your usage.
General settings:
Time range: Adjusts the time frame of the data you're looking at. Options: Last 30 days, Last 7 days, and Last day.
Usage type (available only if you have Flat-Rate covered usage): Selects the BigQuery projects based on their pricing model. Options: On-Demand, Flat-Rate.
Widgets
Except BigQuery Spend by SKU, all other widgets on the BigQuery Lens dashboard are updated daily at 01:00 AM UTC.
Recommendations
The Recommendations widget shows actionable insights to BigQuery optimizations in accordance with the Usage type. You can select each individual recommendation to see the details.
Common recomendation
Backup and remove unused tables: This recommendation applies to both usage types. It lists the unused tables that you can back up and then remove from BigQuery to reduce your storage costs.
Below is an example:
The
Project
column shows the billing project. See Project and Billing project.Select a table name in the
Table
column to open it in your Google Cloud BigQuery console.If the table has multiple partitions, select the number in the
Partition(s) to Remove
column to see which partitions should be removed.You can also download the full list as a CSV file.
On-demand recomendation
Cluster your tables optimally: Cluster the tables listed in the
Table
column using the fields suggested in theCluster By
column. This action helps to eliminate unnecessary scans.Enforce Partition Fields: Use the suggested partitioned fields in the
Partition Field
column for the corresponding queries listed in theQuery ID
column.Limit query jobs: Reduce execution frequency of the listed jobs in the
Query ID
column by the percentage you choose on the slider and view the associated savings of each reduced job in theSavings by Reducing Jobs
column.Partition your tables: Partition the tables listed in the
Table
column by the suggested fields in thePartition Fields
column.Switch to monthly flat-rate plan: Purchase the slots amount indicated in the
Optimal slots amount
column. You can also check the BigQuery Slots Consumption widget for the context.
Flat-rate recomendation
- Move repeating queries to a different time slot: Lower your minimum average slots requirements by moving the recurring queries to a less-busy hour of the day. Review the Slots Consumption widget with the interval set to
Hour
to identify suitable hours. Troughs in your daily slots consumption indicate optimal hours to run recurring queries.
Explorer
Use the Explore widget to get granular information about your BigQuery usage. You can select individual entries to drill down into the data.
The metrics available vary with the scope and usage type.
Usage type | Scope | Metric |
---|---|---|
On-Demand | Billing Project, User | Scan Price, Scan TB |
On-Demand | Project, Dataset, Table | Scan Price, Scan TB, Storage Price, Storage TB |
Flat-Rate | Billing Project, User | Average Slots |
Flat-Rate | Project, Dataset, Table | Storage Price, Storage TB |
Below is an example to view the scan price of a chosen user. You can also select the individual queries run by the user to view it.
Project and Billing project
The Project contains the data that is queried during the job. It's called data project in the BigQuery audit logs overview.
The Billing project is the project that runs the query. It is the project where you create your service account.
The example below shows where you can find the billing project in the Google Cloud BigQuery console.
"Not yet allocated" items
BigQuery Lens may not be able to detect which table is being scanned due to the following reasons:
- Queries like
SELECT 1
- Queries we are unable to parse due to complexity (i.e., a parser time out)
- Queries that use a function or statement that we do not support yet
In such cases, you'll see Not yet allocated
items in the BigQuery Lens Explorer widget.
Slots consumption
Shows the number of slots consumed by your organization in terms of Peak usage and Average usage per Hour or Day.
The example below shows the slots assumption when the usage type is set to Flat-Rate.
Scans by table type
This widget shows the volume of data you are scanning from various types of tables, broken down between unpartitioned tables and multiple types of partitioned tables, along with external sources (e.g., Google Sheets).
Select a table type to see the most frequently scanned tables of that type.
BigQuery spend by SKU
This widget shows the BigQuery costs breakdown per SKU.
You can select the three dots menu (⋮) in the upper-right corner of the widget for two additional options;
Refresh: By default, the BigQuery Spend by SKU widget is updated twice per day. Use the Refresh option to fetch the latest billing data from Google.
NoteThe information shown here is when the DoiT Platform retrieves the billing data, while the freshness of the data is decided by Google Cloud's frequency of data loads.
Open full report: Opens the report that generates the visualization in this widget.
FAQ
Why does BigQuery Lens Advanced need an extra permission?
The bigquery.tables.getData
permission provides clustering recommendations that are unique to BigQuery Lens Advanced.
It allows us to query your BigQuery tables, find your top 20 non-clustered tables, and then determine which fields are the best candidates to cluster those tables by and in what order.
More specifically, this permission helps us identify the cardinality of the columns referenced in the WHERE
clause (i.e., the columns which can be clustered on). Knowing how many distinct elements exist in each column allows us to compute the average chunk size and determine the best savings candidates.
BigQuery Lens Advanced provides recommendations based on the assumption that:
- You'll continue to execute similar queries as you did in the previous 30 days, and
- You'll reference fields in the order that we recommend.
What datasets and sinks do you create?
We create the following dataset and sink in your billing project:
A dataset named
doitintl-cmp-bq
.An audit log sink named
<Billing_Project_ID>.doitintl-cmp-bq.cloudaudit_googleapis_com_data_access
for all your BigQuery jobs.
What queries do you run?
BigQuery Lens has two main processes: a table discovery process to collect table metadata, and a dashboard update process to run queries.
The table discovery process runs on a schedule and writes the table metadata it collected into a table named
tables
in thedoitintl-cmp-bq
dataset.The dashboard update process executes the following queries:
- Daily run of cost simulation queries to provide the cost savings information and potential savings.
- Daily run of lens calculation queries to provide detailed BigQuery usage.
How much does BigQuery Lens cost?
BigQuery Lens creates and maintains the doitintl-cmp-bq
dataset on your behalf and runs queries to provide you with analytics data. The cost for running the queries is usually around $10 per month.
It is worth mentioning that the actual cost depends on the volume of your logs. If your organization uses BigQuery heavily, the cost could be much higher if you are using BigQuery's on-demand analysis pricing model.
If you have an excess of historical jobs, you may see an increased cost for BigQuery Streaming Inserts when we backfill them for the last 30 days during your service account setup.
What's next
Read our blog series on BigQuery Optimizations.
Learn more about The BigQuery Autoscaling Public Preview Rundown (DoiT Edition).
Check out our on-demand events.