Friday 1 February 2013

Understanding of Transformer Looping & Caching


 
Understanding of Transformer Looping & Caching:-

 Power of transformer looping is introduced from IIS Data stage 8.5 and above versions. Transformer looping is made developers life easy in developing complex scenarios which was almost difficult in earlier versions.

 loop variables:-

You can use loop variables when a loop condition is defined for the Transformer stage. When a loop is defined, the Transformer stage can output multiple rows for every row input to the stage. Loop variables are evaluated every time that the loop is iterated, and so can change their value for every output row. Such variables are accessible only from the Transformer stage in which they are declared. You cannot use a loop variable in a stage variable derivation.

Loop variables can be used as follows:

·                     They can be assigned values by expressions.

·                     They can be used in expressions which define an output column derivation.

·                     Expressions evaluating a variable can include other loop variables or stage variables             or the variable being evaluated itself.

 

@ITERATION: - System variable used for transformer looping

 What is Caching in the Transformer?

 The Transformer Cache is an in-memory storage mechanism that is available from within the Transformer stage and is used to help solve complex data integration scenarios. The cache is a first-in/first-out (i.e. FIFO) construct and is accessible to the developer via two new functions:

 
  SaveInputRecord: stores an input row to back of the cache

  GetInputRecord: retrieves a saved row from the front of the cache

 
I will try to explain the transformer caching and looping using below scenario:-

 

Below is my input data:-

 

Customer 
 Store
 Items
Value
600785011
103
10
100
600785011
104
20
200
600785011
109
30
300


Output:-

 

For each customer get a list of the stores he shopped, with this list output every combination of stores dropping customer from the output.

 

Home Store
 Items
 Value
Away Store
Items
Value
103
10
100
103
10
100
103
10
100
104
20
200
103
10
100
109
30
300
104
20
200
103
10
100
104
20
200
104
20
200
104
20
200
109
30
300
109
30
300
103
10
100
109
30
300
104
20
200
109
30
300
109
30
300

 

Below is Job Design:-

  SeqFile -------> TRFM1-------> TRFM2--------->TempDataset
 

 
TSFM1:-

 Define below stage variables in transformer:-

 NumSavedRows= SaveInputRecord

IsBreak= LastRowInGroup(to_tsfm1.Customer)

svCurrStore= If IsBreak Then to_tsfm1.Store:"|": svPrevStore Else to_tsfm1.Store

svPrevStore= If IsBreak Then to_tsfm1.Items else  to_tsfm1.Items :"|": svPrevItem svCurrItem= If IsBreak Then to_tsfm1.Items :"|": svPrevItem Else to_tsfm1.Items

svPrevItem= If IsBreak Then to_tsfm1.Items else  to_tsfm1.Items :"|": svPrevItem

svCurrValue= If IsBreak Then to_tsfm1.Value :"|": svPrevValue Else to_tsfm1.Values

svPrevValue= If IsBreak Then to_tsfm1.Value else  to_tsfm1.Value :"|": svPrevValue

NumRows= If IsBreak Then NumSavedRows Else 0          

 Define below loop conditions & Derivation:-

                       @ITERATION <= NumRows

OutputRecord= GetSavedInputRecord()

 
Output Derivations:-

Store=Inputlink.Store

Item=InputLink.Item

Value=InputLink.Value

ConcatStore=svCurrStore

ConcatItem=svCurrItem

ConcatValue=svCurrValue

Customer=InputCustomer
 

The logic of this transformer is as follows:

1 input data is read and stored in the cache via the "SaveInputRecord" function.
2.key breaks are tested via the "LastRowInGroup.
3.svCurrStore, svCurrItem, svCurrValue will concatenate with pipl all stores,Items & values in the same group.
4.if there is a key break, the " NumRows " variable is set to the number of records on the cache and we'll skip to step 6
5.if there is no key break(NumRows=0) the Loop Condition will be false.  This will cause the next input record to be read and processed .
6.finally, the transformer will loop while "@ITERATION <= NumRows " and read the cache records.

 
TSFM2:-
Define below Stage Variable:-

                         RowNum=Count(InputLink1.ConcatStore,”|”)

 Define below loop conditions :-

                          @ITERATION <= RowNum

 Output Derivation:-

Store=InputLink1.Store

Item=InputLink.Item

Value=InputLink.Value

AwayStore= Field(to_tsfm2.ConcatStore,"|",@ITERATION)

Item= Field(to_tsfm2.ConcatItems,"|",@ITERATION)

Value= Field(to_tsfm2.ConcatValue,"|",@ITERATION)

 
 

Note:- I have just tried to explain better understanding of Transformer looping using this scenario in my blog,may be it can be done in different logics.I will appriciate if you can bring other logics too.
 
Thanks
Prasoon

 




3 comments:

  1. hi. i have a scenario where I should implement the carry forward logic. Consider the below example.
    Account_number Balance Transcation_date
    -------------- ------- ----------------
    1 500 01-May-2013
    1 100 10-May-2013
    1 1000 15-May-2013
    I need the output as below:
    Account_number Balance Transaction_date
    -------------- ------- ----------------
    1 500 01-May-2013
    1 500 02-May-2013
    1 500 03-May-2013
    '
    '
    '1 100 10-May-2013
    1 100 11-May-2013
    '
    '
    1 1000 15-May-2013

    Please someone help me in finding the logic using transformer looping concept.
    Thanks in advance.

    ReplyDelete
  2. It helped me a lot! Thank you!

    ReplyDelete