Skip to main content

Big Query (bq cli)

This guide shows how to create a layer of control on top of the bq command line

Requirements

Big Query Setup

  1. Setup the Bigquery following the topics bellow
  1. Create a service account with the following permissions:
  • BigQuery User
  • BigQuery Data Viewer
  1. Create a JSON type service account key and store in a safe place

For the sake of this guide, we're following the gcp guide to interact with bigquery.

info

In case you already have a testing dataset, you can skip this part

Reference: https://cloud.google.com/iam/docs/creating-managing-service-accounts

Hoop Setup

In your local machine, start hoop

hoop start

Interacting with Datasets

Create the bq query query Connection



info

Add the JSON service account into the KEY_FILE input

This will allow interacting with the bq command line

  1. Try to execute a query
hoop exec -v bqquery -i 'SELECT * FROM babynames.names_2014 LIMIT 1'

Interacting with Jobs

Create a bq head connection



  1. Try to fetch it with bq head -j <session-id>

The previous command was adding the session id as the job id of executed queries on bigquery, thus we could use the previous session id to fetch it.

hoop exec bqhead -- -j <PREVIOUS_SESSION_ID>

More Patterns

The bq command line has other actions that could be used as patterns with Hoop.

Allow jobs to be async

$ bq query --use_legacy_sql=false --nosynchronous_mode

This allow the pattern of consulting the previous job with bq head --job_id .... It's useful for queries that take too long to execute.

Dry-run commands

Dry run allows showing how much of bytes will be consumed

$ bq query --use_legacy_sql=false 'SELECT * FROM babynames.names_2014 limit 5' --dry_run
Query successfully validated. Assuming (...), running this query will process 654791 bytes of data.

A plugin could allow queries that has passed by the --dry-run mode.

Command Line Quota

bq allows specifying the maximum bytes billed, this option could be used as a retriction mechanism

$ bq query \
--use_legacy_sql=false \
--maximum_bytes_billed 1 \
'SELECT * FROM babynames.names_2014 limit 100;'

BigQuery error in query operation: Error processing job
'myproject:bqjob_r107f63c4f31d3d37_000001860d50d5f8_1': Query exceeded
limit for bytes billed: 1. 10485760 or higher required.