What is a staging area, and why do we need it in a DWH?
If the source and target databases are different and the target table is
large (millions of records), then without a staging table we have to design
the Informatica mapping with a Lookup transformation to find out whether each
record already exists in the target table. Because the target holds a huge
volume of data, building the lookup cache is costly and hurts
performance.
If we create staging tables in the target
database, we can simply do an outer join in the Source Qualifier to
determine whether each record is an insert or an update. This approach gives good performance.
It avoids a full table scan on the target to determine
inserts and updates. We can also create indexes on the staging tables: since
these tables are designed for a specific application, the indexes will not
impact any other schemas or users.
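
The outer-join idea above can be sketched outside Informatica. This is a minimal illustration only, assuming an in-memory SQLite database in place of the real target database; the table names `stg_customer` and `dim_customer`, the column names, and the function `classify_staged_rows` are all hypothetical. A staged row with no match in the target is treated as an insert, and a row with a match is treated as an update.

```python
import sqlite3


def classify_staged_rows(conn):
    """Split staged rows into inserts and updates with one outer join."""
    rows = conn.execute(
        """
        SELECT s.cust_id, s.name, t.cust_id IS NULL AS is_new
        FROM stg_customer AS s
        LEFT OUTER JOIN dim_customer AS t ON t.cust_id = s.cust_id
        ORDER BY s.cust_id
        """
    ).fetchall()
    # No matching target key means the row is new; otherwise it is an update.
    inserts = [(cid, name) for cid, name, is_new in rows if is_new]
    updates = [(cid, name) for cid, name, is_new in rows if not is_new]
    return inserts, updates


# Hypothetical sample data: the target holds customers 1 and 2,
# the staging table holds customers 2 and 3.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_customer (cust_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE stg_customer (cust_id INTEGER, name TEXT);
    CREATE INDEX idx_stg_customer_id ON stg_customer (cust_id);
    INSERT INTO dim_customer VALUES (1, 'Asha'), (2, 'Ben');
    INSERT INTO stg_customer VALUES (2, 'Benjamin'), (3, 'Chen');
    """
)
inserts, updates = classify_staged_rows(conn)
print("insert:", inserts)
print("update:", updates)
```

In a real mapping the same join would be written in the Source Qualifier query, and the insert/update flag would drive the downstream routing.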
While loading flat files into the data
warehouse, we can perform cleansing in the staging area.
Data cleansing, also known as data scrubbing, is the process of ensuring that a set of
data is correct and accurate. During data cleansing, records are checked for
accuracy and consistency.
• Since it is a one-to-one mapping from the ODS to staging, we truncate and reload.
• We can create indexes on the staging tables so that the Source Qualifier query performs at its best.
• If we have a staging area, there is no need to rely on an Informatica transformation
to know whether the record exists or not.
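
The truncate-and-reload step from the list above might look like the following sketch. It assumes SQLite as a stand-in database, so `DELETE FROM` is used because SQLite has no `TRUNCATE` statement; the table `stg_orders`, its columns, and the function `reload_staging` are hypothetical names chosen for illustration.

```python
import sqlite3


def reload_staging(conn, ods_rows):
    """Empty the staging table, then reload it one-to-one from the ODS rows."""
    with conn:  # single transaction: commit on success, roll back on error
        conn.execute("DELETE FROM stg_orders")  # stand-in for TRUNCATE
        conn.executemany(
            "INSERT INTO stg_orders (order_id, amount) VALUES (?, ?)",
            ods_rows,
        )
    return conn.execute("SELECT COUNT(*) FROM stg_orders").fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stg_orders (order_id INTEGER, amount REAL)")
# Index on the join key so later queries against staging can use it.
conn.execute("CREATE INDEX idx_stg_orders_id ON stg_orders (order_id)")

first_count = reload_staging(conn, [(1, 10.0), (2, 20.0)])
second_count = reload_staging(conn, [(3, 30.0)])
print(first_count, second_count)
```

Each run leaves only the latest ODS extract in staging, which is why no row-by-row comparison is needed at this step.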
Data Cleansing:
Weeding out unnecessary or unwanted things
(characters and spaces etc.) from incoming data to make it more meaningful and
informative.
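
As a small illustration of this kind of cleansing, the sketch below removes control characters and extra spaces from an incoming text field. The function name `cleanse_text` and the choice of which characters count as unwanted are assumptions for the example; real rules depend on the source data.

```python
import re


def cleanse_text(value):
    """Remove control characters and surplus whitespace from a text field."""
    # Replace ASCII control characters (tabs, newlines, NUL, ...) with a space.
    cleaned = re.sub(r"[\x00-\x1f\x7f]", " ", value)
    # Collapse runs of whitespace into a single space.
    cleaned = re.sub(r"\s+", " ", cleaned)
    # Drop leading and trailing spaces.
    return cleaned.strip()


print(cleanse_text("  John\t\tSmith \x00\n"))
```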
Data Merging:
Data can be gathered from heterogeneous systems and put together.
Data Scrubbing:
Data scrubbing is the process of fixing or eliminating individual pieces of data that are incorrect, incomplete, or duplicated before the data is passed to the end user.
Data scrubbing is aimed at more than eliminating errors and redundancy. The goal is also to bring consistency to various data sets that may have been created with different, incompatible business rules.
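
A rough sketch of scrubbing, under assumed rules: records with a missing key or value are dropped as incomplete, values that cannot be mapped are dropped as incorrect, repeated keys are dropped as duplicates, and the remaining values are mapped to one consistent code. The function `scrub_records`, the record layout, and the mapping `GENDER_MAP` are hypothetical.

```python
def scrub_records(records, code_map):
    """Drop incomplete, incorrect, and duplicate records; unify codes."""
    seen_ids = set()
    scrubbed = []
    for rec in records:
        # Incomplete: the key or the coded value is missing.
        if rec.get("id") is None or not rec.get("gender"):
            continue
        # Incorrect: the value is not known to any source system's rules.
        code = code_map.get(rec["gender"].strip().upper())
        if code is None:
            continue
        # Duplicate: keep only the first record for each key.
        if rec["id"] in seen_ids:
            continue
        seen_ids.add(rec["id"])
        scrubbed.append({"id": rec["id"], "gender": code})
    return scrubbed


# Two source systems encode the same attribute differently.
GENDER_MAP = {"M": "M", "MALE": "M", "F": "F", "FEMALE": "F"}
raw = [
    {"id": 1, "gender": "male"},
    {"id": 1, "gender": "M"},     # duplicate key
    {"id": 2, "gender": "F"},
    {"id": None, "gender": "F"},  # incomplete
    {"id": 3, "gender": "?"},     # incorrect
]
clean = scrub_records(raw, GENDER_MAP)
print(clean)
```

Here records are eliminated rather than fixed; a production process might instead route rejected rows to an error table for correction.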