Skip to main content

BigQuery Lens

DoiT Cloud Navigator: Enhanced and above

Google Cloud BigQuery is an enterprise data warehouse that works across clouds. It's widely used to store, analyze, and visualize data and make data-driven decisions. DoiT BigQuery Lens (BQ Lens) helps you gain insights of your BigQuery usage and optimize your query performance.

Required permission

To access BigQuery Lens, your DoiT account needs the Cloud Analytics permission.

Before you begin

Be aware of the following:

Set up BigQuery Lens

To set up BigQuery Lens:

  1. Connect your Google Cloud Organization to the DoiT Platform, if you haven't done so.

  2. Update your service account to grant the permissions required by BigQuery Lens and optionally, BigQuery Lens Advanced.

For BigQuery Lens Advanced, you also need to enable Google Cloud Resource Manager API, which manages metadata for Google Cloud Platform resource containers:

  1. Log in to the Google Cloud console, go to the Cloud Resource Manager API page.

  2. Select the project where you created your service account.

  3. Select Enable.

Once the BigQuery Lens is set up, the DoiT Platform will backfill historical data for the last 30 days and start gathering information about your Google Cloud BigQuery usage patterns. It can take up to 24 hours to fully populate the BigQuery Lens dashboard with statistics and recommendations.

Update permissions

Beginning December 18, 2023, the BigQuery Lens queries the INFORMATION_SCHEMA.TABLES view, which needs two more organization-level permissions: bigquery.routines.get and bigquery.routines.list.

If you have activated BigQuery Lens before December 18, 2023, run the following gcloud CLI commands to continue using it:

  1. Get your Google IAM organization resource ID.

    ORG_ID=$(gcloud organizations list --format 'value(ID)')
  2. Update the IAM custom role doit_cmp_role.

    gcloud iam roles update doit_cmp_role --organization $ORG_ID \
    --add-permissions bigquery.routines.get,bigquery.routines.list

BigQuery Lens dashboard

The BigQuery Lens dashboard consists of widgets that highlight different aspects of your BigQuery usage.

The Google Cloud BigQuery Lens dashboard

General settings:

  • Time range: Adjusts the time frame of the data you're looking at. Options: Last 30 days, Last 7 days, and Last day.

  • Usage type (available only if you have Flat-Rate covered usage): Selects the BigQuery projects based on their pricing model. Options: On-Demand, Flat-Rate. (We're currently working on the support of the capacity compute pricing model offered in BigQuery editions.)

Widgets

Note

Except BigQuery Spend by SKU, all other widgets on the BigQuery Lens dashboard are updated daily at 01:00 AM UTC.

Recommendations

The Recommendations widget shows actionable insights to BigQuery optimizations in accordance with the Usage type. You can select each individual recommendation to see the details.

Common recommendation

  • Backup and remove unused tables: This recommendation applies to both usage types. It lists the unused tables that you can back up and then remove from BigQuery to reduce storage costs.

    Below is an example:

    Google Cloud BigQuery On-demand Recommendation

    • The Project column shows the billing project. See Project and Billing project.

    • Select a table name in the Table column to open it in your Google Cloud BigQuery console.

    • If the table has multiple partitions, select the number in the Partition(s) to Remove column to see which partitions should be removed.

    • You can also download the full list as a CSV file.

On-demand recommendation

  • Cluster your tables optimally: Cluster tables using the suggested Cluster By fields. This action helps to eliminate unnecessary scans.

  • Enforce Partition Fields: Use the suggested Partition Field in queries indicated by the Query ID.

  • Limit query jobs: Reduce execution frequency of the listed jobs in the Query ID column by the percentage you choose on the slider and view the Savings by Reducing Jobs.

  • Partition your tables: Use the suggested Partition Fields to convert non-partitioned tables to partitioned ones. BigQuery Lens does not give recommendations for tables that are already partitioned.

  • Change dataset to Physical Storage: Change a dataset's storage billing model to pay for the compressed physical storage used by tables in the dataset.

    You can open the recommendation link to view the detailed information on datasets that could benefit from using the physical storage billing model, including the project, table, logical storage size, physical storage size, logical storage cost, physical storage cost, and potential savings.

    Google Cloud BigQuery Physical Storage Recommendation

    • The Project column shows the billing project. See Project and Billing project.

    • Select a table name in the Table column to open it in your Google Cloud BigQuery console.

    • You can also download the full list as a CSV file.

    Note

    Make sure to review Google's physical storage documentation for eligibility criteria. The BigQuery Lens recommendation doesn't perform eligibility check.

Flat-rate recommendation

  • Move repeating queries to a different time slot: Lower your minimum average slots requirements by moving the recurring queries to a less-busy hour of the day. Review the Slots Consumption widget with the interval set to Hour to identify suitable hours. Troughs in your daily slots consumption indicate optimal hours to run recurring queries.

Explorer

Use the Explore widget to get granular information about your BigQuery usage. You can select individual entries to drill down into the data.

Google Cloud BigQuery Lens Explorer

The metrics available vary with the scope and usage type.

Usage typeScopeMetric
On-DemandBilling Project, UserScan Price, Scan TB
On-DemandProject, Dataset, TableScan Price, Scan TB, Storage Price, Storage TB
Flat-RateBilling Project, UserAverage Slots
Flat-RateProject, Dataset, TableStorage Price, Storage TB

The example below shows the scan price of a chosen user. You can also select the individual queries run by the user to view the price.

BigQuery scan price of a user

Project and Billing project

  • The Project contains the data that is queried during the job. It is called data project in the BigQuery audit logs overview.

  • The Billing project is the project that runs the query. It is the project where you create your service account.

    The example below shows where you can find the billing project in the Google Cloud BigQuery console.

    Google Cloud BigQuery billing project

"Not yet allocated" items

BigQuery Lens may not be able to detect which table is being scanned due to the following reasons:

  • Queries like SELECT 1
  • Queries we are unable to parse due to complexity (i.e., a parser time out)
  • Queries that use a function or statement that we do not support yet

In such cases, you'll see Not yet allocated items in the BigQuery Lens Explorer widget.

Slots consumption

Shows the number of slots consumed by your organization in terms of Peak usage and Average usage per Hour or Day.

The example below shows the slots assumption when the usage type is set to Flat-Rate.

Google Cloud BigQuery Slots Consumption

Scans by table type

This widget shows the volume of data you are scanning from various types of tables, broken down between unpartitioned tables and multiple types of partitioned tables, along with external sources (e.g., Google Sheets).

You can select a table type to see the most frequently scanned tables of that type.

Google Cloud BigQuery scans by table type

BigQuery spend by SKU

This widget shows a breakdown of BigQuery costs per SKU.

Google Cloud BigQuery spend by SKU

You can select the three dots menu () in the upper-right corner of the widget for two additional options:

  • Refresh: By default, the BigQuery Spend by SKU widget is updated twice per day. Use the Refresh option to fetch the latest billing data from Google.

    Note

    The information shown in the widget is when the DoiT Platform retrieves the billing data, while the freshness of the data is decided by Google Cloud's frequency of data loads.

  • Open full report: Opens the report that generates the visualization in this widget.

FAQ

Why does BigQuery Lens Advanced need extra permissions?

The BigQuery Lens Advanced feature needs extra permissions to get advanced clustering recommendations and fetch BigQuery reservations information for BigQuery editions when applicable:

  • bigquery.tables.getData: To query table data, identify the top 20 non-clustered tables, and determine the suitable fields and the most efficient order for clustering the tables. This permission helps us identify the cardinality of the columns referenced in the WHERE clause (i.e., the columns that can be clustered on). Knowing how many distinct elements exist in each column allows us to compute the average chunk size and determine the best savings candidates.

  • bigquery.reservations.list: To query the INFORMATION_SCHEMA.RESERVATIONS view for a list of all current reservations in a project.

  • bigquery.reservationAssignments.list: To query the INFORMATION_SCHEMA.ASSIGNMENTS view for all reservation assignments in a project.

  • bigquery.capacityCommitments.list: To query the INFORMATION_SCHEMA.CAPACITY_COMMITMENTS view for all current capacity commitments in a project.

Tip

BigQuery Lens Advanced provides recommendations based on the following assumptions:

  • You'll continue to execute similar queries as you did in the previous 30 days, and
  • You'll reference fields in the order that we recommend.

What datasets and sinks do you create?

We create the following dataset and sink in your billing project:

  • A dataset named doitintl-cmp-bq.

  • An audit log sink named <Billing_Project_ID>.doitintl-cmp-bq.cloudaudit_googleapis_com_data_access for all your BigQuery jobs.

The dataset doitintl-cmp-bq is created when you update your service account with the required permissions and removed when you remove the service account.

What queries do you run?

BigQuery Lens has two main processes: a table discovery process to collect table metadata, and a dashboard update process to run queries.

  • The table discovery process runs on a schedule and writes the table metadata it collected into a table named tables in the doitintl-cmp-bq dataset.

  • The dashboard update process executes the following queries:

    • Daily run of cost simulation queries to provide the cost savings information and potential savings.
    • Daily run of lens calculation queries to provide detailed BigQuery usage.

How much does BigQuery Lens cost?

BigQuery Lens creates and maintains the doitintl-cmp-bq dataset on your behalf and runs queries to provide you with analytics data. The cost for running the queries is usually around $10 per month.

It is worth mentioning that the actual cost depends on the volume of your logs. If your organization uses BigQuery heavily, the cost could be much higher if you are using BigQuery's on-demand analysis pricing model.

Caution

If you have an excess of historical jobs, you may see an increased cost for BigQuery Streaming Inserts when we backfill them for the last 30 days during your service account setup.

What's next