Hi folks, Few days back i was working on SCD Type2. SCD is slowly changing dimensions & Type2 is used for managing history. I have implemented it through SSIS package however i am also describing sql merge statement to implement it.
I have few execution segments. I am explaining the order of execution of these segments. It is necessary because i am using some segment code in SSIS as well-
SQL SERVER :
Execute Segment1 ---> Segment2 --->Segment3 --->Segment2 --->Segment4
Check result after each segment.
SSIS :
Execute Segment1 from above code.
Open BIDS or SSDT ---> Open/Create SSIS project --->Create new project
Create a connection string and drag data flow task and rename it as below :
Now drag a slowly changing dimension and connect it from stagingEmployee(Source)
Click on slowly changing dimension and one wizard will open then follow like below :
Now it will look like this :
If you will check the result then DimChecksum value not correct so here either we can remove the column as we don't need it or we can correct it with the help of some package correction.
Also you need to correct some other components as below :
Correction on OLEDB Command as below (in sql query section and no change in mapping)
Correction on Insert destination > Mapping the checksumdata and then execute Segment4
I have few execution segments. I am explaining the order of execution of these segments. It is necessary because i am using some segment code in SSIS as well-
/*Segment1 */
Create table StagingEmployee(EmployeeCode int, EmployeeName varchar(100), Address varchar(100), ChecksumData int Default -1); Insert into StagingEmployee(EmployeeCode,EmployeeName,Address) Values(1001,'Vijay','Noida'); Insert into StagingEmployee(EmployeeCode,EmployeeName,Address) Values(1002,'Manoj','Gurugram'); Insert into StagingEmployee(EmployeeCode,EmployeeName,Address) Values(1003,'Jai','Delhi'); Update StagingEmployee Set ChecksumData=BINARY_CHECKSUM(EmployeeCode,EmployeeName,Address); Create table dimEmployee(Employeeid int identity(1,1),Employeecode int,EmployeeName varchar(100), Address varchar(100),CreatedDate datetime null Default Getdate(),ExpirationDate datetime null, CurrentRecord char(1) Default 'Y',DimChecksum int Default -1); Select * from StagingEmployee Select * from dimEmployee /*Segment2 */
Insert into dimEmployee(EmployeeCode,EmployeeName,Address,CreatedDate,ExpirationDate,DimChecksum) Select EmployeeCode,EmployeeName,Address,CreatedDate,ExpirationDate,DimChecksum From ( Merge into dimEmployee as T Using StagingEmployee as S ON T.EmployeeCode=S.EmployeeCode When Matched and T.DimChecksum<>S.ChecksumData and T.CurrentRecord='Y' Then Update Set ExpirationDate=GetDate()-1, CurrentRecord= 'N' When Not Matched Then Insert(EmployeeCode,EmployeeName,Address,DimChecksum) Values(S.EmployeeCode,S.EmployeeName,S.Address,S.ChecksumData) Output $action,S.EmployeeCode,S.EmployeeName,S.Address,GetDate(),Null,S.ChecksumData ) as Tab (action,EmployeeCode,EmployeeName,Address,CreatedDate,ExpirationDate,DimChecksum) where action='UPDATE'; Select * from StagingEmployee Select * from dimEmployee /*Segment3 */
Truncate table StagingEmployee; Insert into StagingEmployee(EmployeeCode,EmployeeName,Address) Values(1001,'Vijay','Pune'); Insert into StagingEmployee(EmployeeCode,EmployeeName,Address) Values(1005,'Varun','Mumbai'); Update StagingEmployee Set ChecksumData=BINARY_CHECKSUM(EmployeeCode,EmployeeName,Address); /*Segment4 */ -- <----Execute2 code
Select * from StagingEmployee Select * from dimEmployee /* Truncate table StagingEmployee; Truncate table dimEmployee; Drop table StagingEmployee; Drop table dimEmployee; */
SQL SERVER :
Execute Segment1 ---> Segment2 --->Segment3 --->Segment2 --->Segment4
Check result after each segment.
Now execute Segment4
SSIS :
Execute Segment1 from above code.
Open BIDS or SSDT ---> Open/Create SSIS project --->Create new project
Create a connection string and drag data flow task and rename it as below :
Click on SCD Type2 (Control flow task) --> drag a source assistant and connect with table StagingEmplohyee and name it StagingEmployee
Now drag a slowly changing dimension and connect it from stagingEmployee(Source)
Click on slowly changing dimension and one wizard will open then follow like below :
Now it will look like this :
Now run the package and try the result in sql server management studio.
If you will check the result then DimChecksum value not correct so here either we can remove the column as we don't need it or we can correct it with the help of some package correction.
Also you need to correct some other components as below :
Correction on OLEDB Command as below (in sql query section and no change in mapping)
Correction on Insert destination > Mapping the checksumdata and then execute Segment4
Now Execute Segment1 ---> Run the package see the results
Now Execute Segment3 ---> run the package and check the results and then Execugte Segment4