As a data professional, it's a common requirement to understand the state of data as of a specific date, such as identifying changes made since a particular point in time. While creating a change log or implementing Change Data Capture (CDC) is a typical approach, it's often impractical to maintain such logs for all tables, especially in systems with a significant number of tables. In modern data warehouses or data systems, which often contain more than 50+ tables, maintaining change logs for many of them can become a substantial overhead. Data evolves constantly, and sometimes these changes are extensive, particularly in data warehouse environments. When debugging data issues, having the ability to view the state of the data as of a specific date can be invaluable.
BigQuery has added this great feature to solve this problem with minimal effort, with almost no additional development.
Bigquery has FOR SYSTEM_TIME AS OF that help us access data as of a particular date.
SELECT
*
FROM
`poc.bq-api-test`
LIMIT
100;
Step - 2 : Update records
UPDATE
`poc.bq-api-test`
SET
name ='test2'
WHERE
id=1;
SELECT
*
FROM
`poc.bq-api-test` FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR);
Conclusion: Using FOR SYSTEM_TIME AS OF We can travel to any past time and can see the state of data. Obviously it has some limitations; we can see the details on GCP documentation here.
Access historical data using time travel | BigQuery | Google Cloud
Here is how to restore accidently deleted Bigquery Dataset
How to restore a deleted BigQuery Dataset (derrickqin.com)
No comments:
Post a Comment