As a data person it's quite a common requirement to know the as of date data state. Like to see data as of date what has been changed since a particular date. Generally we create a change log or CDC but it's not practically possible to maintain the change log for all the tables. Nowadays I can’t imagine a data warehouse of a data system with less than 50 tables maintaining a change log for many tables is extra overhead. Data keeps changing over time. Sometimes there are huge changes especially in the data warehouse system. Especially when you have to debug some data issue knowing the as of date data state this feature will be quite helpful.
BigQuery has added this great feature to solve this problem with minimal effort, with almost no additional development.
Bigquery has FOR SYSTEM_TIME AS OF that help us access data as of a particular date.
SELECT
*
FROM
`poc.bq-api-test`
LIMIT
100;
Step - 2 : Update records
UPDATE
`poc.bq-api-test`
SET
name ='test2'
WHERE
id=1;
SELECT
*
FROM
`poc.bq-api-test` FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR);
Conclusion: Using FOR SYSTEM_TIME AS OF We can travel to any past time and can see the state of data. Obviously it has some limitations; we can see the details on GCP documentation here.
Access historical data using time travel | BigQuery | Google Cloud
Here is how to restore accidently deleted Bigquery Dataset
How to restore a deleted BigQuery Dataset (derrickqin.com)
No comments:
Post a Comment