What is Google's super-fast data warehouse, BigQuery?

This is Ohara from the China office

This time, we will focus on BigQuery, a fully managed data warehouse provided by Google

First of all, what is BigQuery?

BigQuery is a big data analysis service provided by Google, and was officially announced as a service at Google I/O (a developer event hosted by Google) in 2012

BigQuery grew out of Dremel, a data analysis system used internally at Google, which was then improved for external users and made available as a service

Japanese system vendors also offer a wide range of big data analysis services and software, but what sets BigQuery apart is that it runs SQL-like queries against datasets of several TB (terabytes) or even PB (petabytes) and returns the results in a few seconds to a few tens of seconds

Why BigQuery is fast

BigQuery is fast because it uses the following two mechanisms:

Column-oriented data store

A traditional RDBMS is record-oriented (= row-oriented): data is stored row by row, with each entire record kept together in the same storage

In a column-oriented database, by contrast, each record is split into its columns and every column is placed in separate storage. A query then reads only the columns it needs, which minimizes the amount of data transferred, and because values of the same type sit together they compress well, enabling high-speed data lookup when executing queries

○ Traditional RDBMS's "record = row-oriented"
○ BigQuery's "column = column-oriented"

*Source: Dremel: Interactive Analysis of Web-Scale Datasets
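The difference between the two layouts can be illustrated with a small sketch. This is a toy model in plain Python, not how BigQuery is actually implemented; the table contents and the function names are made up for illustration. It counts how many stored values each layout has to touch to total a single column.

```python
# Toy comparison of row-oriented and column-oriented layouts.
# The sample records and all names here are hypothetical.
records = [
    {"user_id": 1, "country": "JP", "amount": 120},
    {"user_id": 2, "country": "CN", "amount": 80},
    {"user_id": 3, "country": "JP", "amount": 200},
]

# Row-oriented: each record is stored together as one unit.
row_store = [tuple(r.values()) for r in records]

# Column-oriented: each column is stored separately.
column_store = {name: [r[name] for r in records] for name in records[0]}


def sum_amount_row_store(rows):
    """Total 'amount' by reading whole records; returns (total, values_read)."""
    total = 0
    values_read = 0
    for row in rows:
        values_read += len(row)  # every field of the record is touched
        total += row[2]          # 'amount' is the third field
    return total, values_read


def sum_amount_column_store(columns):
    """Total 'amount' by reading only that column; returns (total, values_read)."""
    amounts = columns["amount"]
    return sum(amounts), len(amounts)


print(sum_amount_row_store(row_store))        # (400, 9)
print(sum_amount_column_store(column_store))  # (400, 3)
```

Both layouts give the same total, but the column-oriented one reads a third of the values here. The gap widens as a table gains more columns that the query does not use.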

Tree architecture

BigQuery has a tree-based distributed processing structure

The root server receives a query from the client and passes it down through the intermediate servers directly below it to the leaf servers. The leaf servers execute the query, processing the column-oriented data in parallel, and the partial results are quickly aggregated on the way back up the tree and returned as the query result.
(Reportedly, results can be obtained in a few seconds even for huge, petabyte-scale datasets of 500 million to 1 billion rows.)

○ Column-oriented data store
○ Tree architecture

*Source: Dremel: Interactive Analysis of Web-Scale Datasets
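The flow through the tree can be sketched roughly as follows. This is a simplified model under stated assumptions, not BigQuery's actual implementation: the shard data, the function names, and the use of a thread pool to stand in for leaf servers are all hypothetical. Each leaf scans its own shard of a column, and each level above merges the partial results from the level below.

```python
from concurrent.futures import ThreadPoolExecutor


def leaf_scan(shard):
    """Leaf server: scan one shard of a column, return a partial (sum, count)."""
    return sum(shard), len(shard)


def merge(partials):
    """Intermediate or root server: combine partial aggregates from below."""
    partials = list(partials)
    return sum(p[0] for p in partials), sum(p[1] for p in partials)


def intermediate(shards):
    """Intermediate server: run its leaves in parallel and merge their results."""
    with ThreadPoolExecutor() as pool:
        return merge(pool.map(leaf_scan, shards))


def root_query(tree):
    """Root server: fan out to the intermediates, then compute the final average."""
    total, count = merge(intermediate(shards) for shards in tree)
    return total / count


# Two intermediate servers; the first has two leaf shards, the second has one.
tree = [
    [[120, 80], [200]],
    [[50, 50, 100]],
]
print(root_query(tree))  # 100.0
```

Because sums and counts can be merged in any order, no single server ever has to hold the whole dataset, which is what lets this kind of aggregation spread across many leaves.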

The above two points are the reasons why BigQuery is fast

Curious about the price?

Even so, cost is always a concern when adopting BigQuery, so I have put together a brief summary

BigQuery's pricing structure changed significantly in 2023 and now consists of two components: on-demand pricing and capacity pricing. This article lists the prices for the Tokyo region

● On-demand pricing = $7.5 (per TiB)
・Charges are based on the number of bytes processed by each query on BigQuery.
・The first 1 TiB of query data processed each month is free.

● Capacity pricing = $0.051 (per slot-hour, Standard Edition)
・Charges are based on query processing capacity, measured in slots (virtual CPUs).
*Prices vary depending on the edition.
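As a rough worked example, the two models can be compared with a few lines of Python. This is only a sketch using the Tokyo-region figures quoted above. It assumes the capacity price is billed per slot-hour, and it ignores storage charges, minimum billing increments, and anything else on the official pricing page; the function names are made up for illustration.

```python
# Figures quoted in this article for the Tokyo region.
ON_DEMAND_USD_PER_TIB = 7.5
FREE_TIB_PER_MONTH = 1.0
# Assumption: the Standard Edition price is charged per slot-hour.
STANDARD_USD_PER_SLOT_HOUR = 0.051


def on_demand_cost(tib_processed_in_month):
    """Estimated monthly on-demand cost in USD after the free 1 TiB."""
    billable = max(0.0, tib_processed_in_month - FREE_TIB_PER_MONTH)
    return billable * ON_DEMAND_USD_PER_TIB


def capacity_cost(slots, hours):
    """Estimated capacity cost in USD for a number of slots held for some hours."""
    return slots * hours * STANDARD_USD_PER_SLOT_HOUR


print(on_demand_cost(0.5))     # 0.0  (within the free tier)
print(on_demand_cost(5))       # 30.0 (4 billable TiB x $7.5)
print(capacity_cost(100, 10))  # roughly 51 USD for 100 slots over 10 hours
```

Which model is cheaper depends on the workload: on-demand suits occasional queries over modest amounts of data, while capacity pricing tends to pay off when queries run heavily and predictably.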

For more information, please see the official BigQuery pricing website

Summary

It's cheap, so why not give it a try? (If you have a Google account, you can get started right away.)

▼ For details on BigQuery services, click here ▼
https://cloud.google.com/bigquery/?hl=ja

If you want to talk to a cloud professional

Since our founding, Beyond has used the technical capabilities we have cultivated as a multi-cloud integrator and managed service provider (MSP) to design, build, and migrate systems using a variety of cloud server platforms, including AWS, GCP, Azure, and Oracle Cloud

We provide a custom-made cloud server environment optimized for our customers based on the specifications and functions of the systems and applications they require, so if you are interested in the cloud, please feel free to contact us

● Cloud / Server design and construction
● Cloud / Server migration
● Cloud / Server operation, maintenance and monitoring (24 hours a day, 365 days a year)

About the author

Ohara

He started his career in the telecommunications industry as a salesperson responsible for the implementation of IT products such as corporate network services, office equipment, and groupware

He then worked at a system integrator-affiliated data center company as a pre-sales engineer for physical servers and hosting services, and as a customer engineer for SaaS-based SFA/CRM and B2B e-commerce, before joining Beyond, where he currently works

He is currently stationed in Shenzhen, China, and spends his days watching Chinese dramas and Bilibili

Qualifications: Bookkeeping Level 2