What is Google's super-fast data warehouse, BigQuery?

This is Ohara from the China office
This time, we will focus on BigQuery, a fully managed data warehouse provided by Google
First of all, what is BigQuery?
BigQuery is a big data analysis service provided by Google, and was officially announced as a service at Google I/O (a developer event hosted by Google) in 2012
It grew out of Dremel, a data analysis system used internally at Google, which was adapted for external users and released as a service.
Japanese system vendors also offer a wide range of big data analysis services and software. What sets BigQuery apart is that it runs SQL-like queries against data sets of several terabytes (TB) or even petabytes (PB) and returns the results in a few seconds to a few tens of seconds.
Why BigQuery is fast
BigQuery is fast because it uses the following two mechanisms:
Column-structured data store
A traditional RDBMS is record-oriented (row-oriented): it stores each record, with all of its fields, together in the same storage.
A column-oriented database instead splits each record into its columns and stores each column separately. A query then reads only the columns it references, which minimizes the amount of data transferred, and because the values within a single column tend to be similar, they compress well. Both effects speed up data lookup when a query runs.
○ Traditional RDBMS: record-oriented (row-oriented)
○ BigQuery: column-oriented
* Source of information: Dremel: Interactive Analysis of Web-Scale Datasets
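To make the difference concrete, here is a minimal toy sketch in Python. It is not how BigQuery is implemented internally; the sample records, the function names, and the use of run-length encoding as the compression scheme are all assumptions chosen for illustration.

```python
# Toy comparison of row-oriented and column-oriented layouts.
# Illustrative only: the data and the run-length encoding are assumptions,
# not BigQuery's actual storage format.
records = [
    {"user": "a", "country": "JP", "amount": 120},
    {"user": "b", "country": "JP", "amount": 80},
    {"user": "c", "country": "CN", "amount": 200},
]

# Row-oriented: all fields of a record are stored together.
row_store = [tuple(r.values()) for r in records]

# Column-oriented: each column is stored as its own sequence.
column_store = {key: [r[key] for r in records] for key in records[0]}


def total_amount_row(store):
    # Walks every full record even though only one field is needed.
    return sum(row[2] for row in store)


def total_amount_column(store):
    # Reads only the "amount" column and never touches the others.
    return sum(store["amount"])


def run_length_encode(values):
    """Compress runs of equal adjacent values into [value, count] pairs."""
    encoded = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return encoded


print(total_amount_row(row_store))
print(total_amount_column(column_store))
print(run_length_encode(column_store["country"]))
```

Both totals are the same, but the column version reads a single list, and a column with repeated values such as `country` shrinks under even a very simple encoding.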
Tree Architecture
BigQuery has a tree-based distributed processing structure
The root server receives a query from the client and passes it through the intermediate servers directly below it to the leaf servers, which execute the query against the column-oriented data described above in parallel. The partial results are then aggregated on the way back up the tree, so the final result can be returned quickly.
(There are reports that even with massive amounts of data, such as 500 million to over 1 billion rows, results can be obtained in just a few seconds.)
○ Columnar structure data
○ Tree architecture
* Source of information: Dremel: Interactive Analysis of Web-Scale Datasets
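The root / intermediate / leaf flow can be sketched roughly as follows. This is a simplified single-process model using threads, not BigQuery's real execution engine; the function names, the nested-list shards, and the counting query are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def leaf_count(shard, predicate):
    """Leaf server: scan one shard of a column and return a partial count."""
    return sum(1 for value in shard if predicate(value))


def intermediate_count(shards, predicate):
    """Intermediate server: fan out to its leaves in parallel, then combine."""
    with ThreadPoolExecutor() as pool:
        partials = pool.map(lambda shard: leaf_count(shard, predicate), shards)
        return sum(partials)


def root_query(groups, predicate):
    """Root server: dispatch the query to each intermediate and merge the results."""
    return sum(intermediate_count(shards, predicate) for shards in groups)


# Two intermediate servers, each responsible for two leaf shards (made-up data).
groups = [
    [[1, 5, 9], [12, 3]],
    [[7, 20], [2, 15, 30]],
]

# Roughly "SELECT COUNT(*) WHERE value >= 10" over all shards.
result = root_query(groups, lambda value: value >= 10)
print(result)
```

Each level only has to combine a handful of small partial results, which is why adding more leaves lets the scan scale out without making the root a bottleneck.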
The above two points are the reasons why BigQuery is fast
Curious about the price?
Even with BigQuery's speed, cost is still a concern, so here is a brief summary.
BigQuery's pricing structure changed significantly in 2023 and now consists of two components: on-demand pricing and capacity pricing. This article lists the prices for the Tokyo region.
● On-demand pricing = $7.5 (per TiB)
- Charges are based on the number of bytes processed by each query on BigQuery.
● Capacity pricing = $0.051 (per slot-hour, Standard Edition)
- Charges are based on query processing capacity, measured in slots (virtual CPUs).
* Prices vary depending on the edition.
For more details, please visit the official BigQuery pricing website.
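As a rough worked example of the two models, the sketch below uses the Tokyo-region figures quoted above. It assumes the capacity price is billed per slot-hour and ignores free tiers, minimum charges, and rounding rules, so treat the output as a ballpark estimate only; the function names are hypothetical.

```python
TIB = 1024 ** 4  # bytes in one tebibyte

# Prices quoted in this article for the Tokyo region (may change over time).
ON_DEMAND_USD_PER_TIB = 7.5
STANDARD_USD_PER_SLOT_HOUR = 0.051  # assumption: billed per slot-hour


def on_demand_cost(bytes_processed):
    """Estimated on-demand charge for a query that processes the given bytes."""
    return bytes_processed / TIB * ON_DEMAND_USD_PER_TIB


def capacity_cost(slots, hours):
    """Estimated capacity charge for holding a number of slots for some hours."""
    return slots * hours * STANDARD_USD_PER_SLOT_HOUR


# A query scanning 2 TiB on demand.
print(on_demand_cost(2 * TIB))

# 100 slots used for one hour on the Standard Edition.
print(round(capacity_cost(100, 1), 2))
```

Because on-demand charges follow bytes processed, selecting only the columns you need directly lowers the bill, which ties back to the column-oriented storage described earlier.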
Summary
It's cheap, so why not give it a try? (If you have a Google account, you can get started right away.)
▼ For more details about BigQuery, click here ▼
https://cloud.google.com/bigquery/?hl=ja
If you want to talk to a cloud professional
Since our founding, Beyond has used the technical capabilities we have cultivated as a multi-cloud integrator and managed service provider (MSP) to design, build, and migrate systems using a variety of cloud server platforms, including AWS, GCP, Azure, and Oracle Cloud
We provide a custom-made cloud server environment optimized for our customers based on the specifications and functions of the systems and applications they require, so if you are interested in the cloud, please feel free to contact us
● Cloud/Server design and construction
● Cloud/Server migration
● Cloud/Server operation, maintenance, and monitoring (24 hours a day, 365 days a year)
