Incremental sync from the dedicated Redshift cluster into another warehouse can be accomplished by making use of the fact and dimension tables. It cannot be done directly from the consolidated data sources. If consolidated data sources are required in the warehouse, the business can perform incremental sync using the fact and dimension tables and then use the SQLs to generate the consolidated data sources from the facts & dimensions. This incremental sync can be done once a day during overnight hours. It cannot be done multiple times a day.
The following are the steps to perform incremental sync of the fact and dimension tables:
Do a one-time full load sync of the data.
From then onwards, note down the time stamp that the last sync was initiated and use that for the next incremental cycle.
Before pulling incremental load, the records that have been reprocessed since the last sync have to be deleted. To do this:
a. Identify all the PKs that got updated in the Bi_DimFactChanges table since the time the last sync was done. Use the etlcreated timestamp to identify the updates.
b. Identify all the PKs that got updated in the fact and dimension tables in the dedicated Redshift since the time the last sync was done. Use the etlcreated timestamp in each table to identify the updates.
c. Compile all the distinct PKs from steps a and b and delete these records from the tables in your warehouse.
Next, pull in the incremental load by identifying all the PKs that got updated in the fact and dimension tables in the dedicated Redshift since the time the last sync was done. Use the etlcreated timestamp in each table to identify the updates. Append these records to the tables in your warehouse. Note: Records are never deleted from dimension tables, they are only updated.
Maintain a log of the timestamps (in UTC) when each sync is initiated to help identify the delta load for the next sync.
Use this table to see how to join each fact to the Bi_DimFactChanges table.
Column from Fact to be joined with Bi_DimFactChanges (factpk)