Joined: 02 Nov 2010
|Posted: Tue Nov 02, 2010 1:55 pm Post subject: data Cleansing-standarizing column,s data
|i have converted many text,acess,acess,oracle data to sql now i have to perform data cleansing.
i ,m new to SSIS and i don`t know how to use SSIS to perform my data cleansing before ETL. i have used some queries in my SQL to check dirty data,
for gender i found 8 values with (00,11,01,10,001,M) .
similarly, for date of birth
12 invalid dates. 22-Jal-75,1/27/75,27-Apl-77..... 29-Feb-75,31-jun-89 etc.
WHAT I WANT????
1)first SSIS read data.
2)if don`t find 0 or 1 in gender send erronous data(00,11,01,10,001,M) to "error table".
3)in "error table", dirty data (00,11,01,10,001,M) should be converted in 0,1
4)after standarization, error table data records should go to original table at its place.
5)Now i will change 0 to male and 1 to female.
6)now i will send complete table of student(stid,name,father,gender,adress,last degree,reg data,dob) to a new table STD_INFO.
7) i will repeat same 1-6 steps for all databases(lahore,karachi,islamabad,peshawar) and then i will integrate STD_INFO tables of all databases into one.
i hope now you can understand what i have to do. Similar case with DOB first dirty data to erronous table after conversion there to original table and then i will check all records at the same time.