Tuesday 16 May 2023

Access Historic data as of time travel in BQ



As a data person it's quite a common requirement to know the as of date data state. Like to see data as of date what has been changed since a particular date. Generally we create a change log or CDC but it's not practically possible to maintain the  change log for all the tables. Nowadays I can’t imagine a data warehouse of a data system with less than 50 tables maintaining a change log for many tables is extra overhead. Data keeps changing over time. Sometimes there are huge changes especially in the data warehouse system. Especially when you have to debug some data issue knowing the as of date data state this feature will be quite helpful.


BigQuery has added this great feature to solve this problem with minimal effort, with almost no additional development. 


Bigquery has FOR SYSTEM_TIME AS OF that help us access data as of a particular date.


Step - 1:

SELECT

  *

FROM

  `poc.bq-api-test`

LIMIT

  100;





Step - 2 : Update records 


UPDATE

  `poc.bq-api-test`

SET

  name ='test2'

WHERE

  id=1;


Finally : 
We can see the data what it was before change.
Its just small example, it has capability to see default 7days pas state of data. We can use this feature to restore the table as of particular past date etc.


SELECT

  *

FROM

  `poc.bq-api-test` FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR);







Conclusion: Using FOR SYSTEM_TIME AS OF We can travel to any past time and can see the state of data. Obviously it has some limitations; we can see the details on GCP documentation here.


Access historical data using time travel  |  BigQuery  |  Google Cloud


Here is how to restore accidently deleted Bigquery Dataset 
How to restore a deleted BigQuery Dataset (derrickqin.com)



No comments:

Post a Comment

Data Mesh

  Data Mesh is a relatively new approach to managing and organizing data within organizations, especially large enterprises. It advocates fo...