Abinitio Material

  • 8/11/2019 Abinitio Material

    Question Answer ===========================

    Phases vs Checkpoints

    Phases - are used to break the graph into pieces …

    after its completion. Phases are used to effecti… CPU, disk) parts of the application.

    Checkpoints - created for recovery purposes. T… can recover to the latest saved point - and rerun …

    You can have phase breaks with or without checkpoints.

    xfr

    A new sandbox will have many directories: m… with extension .xfr containing your own custo…

    "somepath/xfr/yourfile.xfr"). Usually XFR sto…

    three types of parallelism

    1) Data Parallelism - data (… copies of the same components.
    2) Component Parallelism (execute simultane…
    3) Pipeline (sequential).

    MFS

    Multi-File System

    m_mkfs - create a multifile (m_mkfs ctrlfile m…
    m_ls - list all the multifiles
    m_rm - remove the multifile

    m_cp - copy a multifile

    m_mkdir - to add more directories to existing …

    Memory requirements of a graph

    Each partition of a component uses: 6…

    Add size of lookup files used in phase …

    once). Multiply by degree of parallelism. Add …

    is used in that phase.

    Select the largest-memory phase in the …

    …LUP SCAN WIT… …LUP

    Scan followed by Dedup Sort and select the la…

    dedup sort with null key

    If we don't use any key in the sort component

    then the output depends on the keep parameter:

    first - only the first record

    last - only last record

    unique_only - there will be no records
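The three keep options above can be illustrated with a small Python sketch. This is not Ab Initio code: `dedup_sorted` is a hypothetical helper, and it groups records by key regardless of input order, which is a simplification. With no key, every record falls into a single group, which is why `unique_only` returns nothing as soon as there is more than one input record.

```python
def dedup_sorted(records, key=None, keep="first"):
    """Hypothetical helper mimicking the keep options described above."""
    groups = {}
    order = []
    for rec in records:
        # With a null key every record lands in the same group.
        k = key(rec) if key is not None else None
        if k not in groups:
            groups[k] = []
            order.append(k)
        groups[k].append(rec)

    out = []
    for k in order:
        group = groups[k]
        if keep == "first":
            out.append(group[0])
        elif keep == "last":
            out.append(group[-1])
        elif keep == "unique_only":
            # Only groups holding exactly one record survive.
            if len(group) == 1:
                out.append(group[0])
        else:
            raise ValueError("unknown keep option: " + keep)
    return out


records = ["a", "b", "c"]
kept_first = dedup_sorted(records, keep="first")
kept_last = dedup_sorted(records, keep="last")
kept_unique = dedup_sorted(records, keep="unique_only")
```

Passing a `key` function (for example `key=lambda r: r[0]`) gives the usual per-key behaviour instead of the single-group case.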

    Join on partitioned flow

    file1 (A,B,C), file2 (A,B). We partition bot… should we partition by "A,B"? Not clear.

    checkin, checkout

    You can do checkin/checkout using the wizard …

    how to have different passwords for QA and production

    parameterize the .dbc file - or use environmen…

    the environment parameters required by the p… Environment.

    Aggregate vs Rollup

    Aggregate - old component
    Rollup - newer, extended, recommended to us…

    (built-in functions like sum count avg min ma…

    EME, GDE, Co-operating system

    EME = Enterprise Metadata Environme…

    analysis, dependency analysis). It is on… transformations, config info, source an… you checkin/checkout. /Project dir of E… sandboxes connected to it. It also hel… air commands to manipulate repository …

    GDE = Graphical Development Enviro…

    Co-operating system = Ab Initio server …

    fencing

    fencing means job controlling on priority basi… In AI it actually refers to customized phase br… source data volume process will not cough in … processes.

    Fencing - changing a priority of a job
    Phasing - managing the resources to avoid dea…
    For example, limiting the number of simultan… (by breaking the graph into phases, only 1 of w…

    Continuous components

    Continuous components - produce useful outp…

    Continuous rollup, Continuous update batch s…

    deadlock

    Deadlock is when two or more proce… resource. To avoid use phasing and r…

    environment

    AB_…

    wrapper script

    unix script to run graphs

    multistage component

    A multistage component is a compon… in 5 stages (1. input select, 2. tempora… output selection, 5. finalize). So it is …

    packages. Examples: scan Normaliz… normalize and denormalize sorted.

    Dynamic DML

    Dynamic DML is used if the input m… different time different input files ar… different dml. In that case we can use… read in the input file received and ac… dml is used.

    fan in, fan out

    fan out - partition component

    fan in - departition component
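As a rough illustration of fan out followed by fan in, the sketch below splits records into partitions by a key and then gathers them back into one flow. It is plain Python, not Ab Initio; `fan_out`, `fan_in`, and the CRC32-based partition choice are assumptions made for the example.

```python
import zlib


def fan_out(records, key, n_partitions):
    """Partition step: route each record to a partition chosen from its key."""
    partitions = [[] for _ in range(n_partitions)]
    for rec in records:
        # CRC32 is used only to get a stable, repeatable partition number.
        idx = zlib.crc32(str(key(rec)).encode("utf-8")) % n_partitions
        partitions[idx].append(rec)
    return partitions


def fan_in(partitions):
    """Departition step: gather all partitions back into one serial flow."""
    return [rec for part in partitions for rec in part]


rows = [{"cust": c, "amt": a} for c, a in [("A", 1), ("B", 2), ("A", 3), ("C", 4)]]
parts = fan_out(rows, key=lambda r: r["cust"], n_partitions=3)
gathered = fan_in(parts)
```

Because the partition is derived from the key, records with the same key always land in the same partition, which is what key-based operations downstream rely on.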

    lock

    a user can lock the graph for editing and … can not edit the same graph.

    Join vs Lookup

    Lookup is good for speed for small f… memory). For large files use Join. Yo… limit to handle big Joins.
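The trade-off above (a lookup is fast because the small file sits in memory) can be sketched with a Python dictionary. This is an illustration only; `build_lookup`, `enrich`, and the sample data are hypothetical names, not Ab Initio functions.

```python
def build_lookup(rows, key):
    """Load the small file fully into memory, indexed by key."""
    return {row[key]: row for row in rows}


def enrich(records, lookup, key, field, default=None):
    """Stream the large input and fetch one field from the in-memory lookup."""
    out = []
    for rec in records:
        match = lookup.get(rec[key])
        enriched = dict(rec)
        enriched[field] = match[field] if match is not None else default
        out.append(enriched)
    return out


countries = [{"code": "US", "name": "United States"},
             {"code": "FR", "name": "France"}]
orders = [{"order": 1, "code": "FR"}, {"order": 2, "code": "XX"}]

lookup = build_lookup(countries, "code")
result = enrich(orders, lookup, "code", "name", default="unknown")
```

The whole lookup table is held in memory at once, so this approach only suits small reference data; for large inputs a join avoids that memory cost.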

    multi update

    multi update executes SQL statemen…

    completely separate piece of work.

    scheduler

    We can use Autosys, Contro…

    We can take care of depende…

    scripts should run sequentia… Autosys, or we can create a w… sequential commands (nohup command.ksh &; etc). We c… Initio to execute individual s…

    Api and Utility modes in input table

    These are database interfaces (api - u…

    whatever vendor provides)

    lookup file

    lookup file component. Func…

    lookup_next, lookup_match, …

    Lookups are always used wit…

    components.

    Calling stored proc in DB

    You can call stored proc (for examp… you can even write SP in Ab Initio. … good performance.

    Frequently used functions

    string_ltrim, string_lrtrim, string_sub… now()

    data validation

    is_valid, is_null, is_blank, is_define…

    driving port

    When joining inputs (in0, in1, ...) on… default - in0). Driving input is usua… smallest can have "Sorted-Input" par… sorted" because it will be loaded com…

    Ab Initio vs Informatica for ETL

    Ab Initio benefits: parallelism built i… amounts of data, easy to build and ru…

    easily modified as needed (if somethi… itself). The scripts can be easily sche… and easily integrated with other syste…

    Ab Initio doesn't require a dedicated …

    Ab Initio doesn't have built-in CDC … Capture).

    Ab Initio allows to (attach error / rej… capture and analyze the message and …

    Informatica which has just one huge… metrics for each component.

    override key

    override key option is used when we… different field names.

    control file

    control file should be in the multifile… the serial files)

    max-core

    max-core parameter (for example, so… of memory used by a component (lik… before spilling to disk. Usually you d… default value. Setting it too high may… of OS swapping and degrading of th…

    Input Parameters

    graph > select parameters tab > click… Usage: $paramname. Edit > paramet… substituted during run time. You may… scope as formal.

    Error Trapping

    Each component has reject, error, an… records, Error captures correspondin… execution statistics of the componen… each component by setting reject thr… on first reject, or setting ramp/limit.

    function in transform function.
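The ramp/limit setting mentioned above can be sketched as a tolerance check. The formula used here (abort when rejects exceed limit + ramp * records processed so far) is an assumption based on how this setting is commonly described, since the source text is truncated; `should_abort` is a hypothetical helper, not an Ab Initio function.

```python
def should_abort(rejects_so_far, records_so_far, limit, ramp):
    """Assumed rule: tolerate up to limit + ramp * records_so_far rejects."""
    tolerance = limit + ramp * records_so_far
    return rejects_so_far > tolerance


# limit=1, ramp=0.5, 10 records processed: tolerance is 1 + 0.5 * 10 = 6 rejects.
within_tolerance = should_abort(6, 10, limit=1, ramp=0.5)
over_tolerance = should_abort(7, 10, limit=1, ramp=0.5)

# limit=0 and ramp=0 behaves like "abort on first reject".
abort_on_first = should_abort(1, 10, limit=0, ramp=0)
```

Under this assumed rule, `limit` is an absolute allowance and `ramp` is a per-record rate, so the tolerance grows as more records are processed.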

    tuning performance

    Go parallel using partitioning. Round…

    balance. Use Multi-file system (MFS).

    Use Ad…

    Minimize the use of regular expression…

    transfer functions. Avoid repartitioning of data unnecessa…

    more than two flows, use Reformat rat… For joining records from flows use C…

    there is a need to follow some specific… is required then it is preferable to use G…

    Instead of putting many Reformat com…

    indexes parameter in the first Reforma… condition there.

    delta table

    Delta table maintains the sequencer of …

    Master (or base) table - a table on top o…

    scan vs rollup

    rollup - performs aggregate calculations on gr… totals

    packages

    used in multistage components or transform c…

    Reformat vs "Redefine Format"

    Reformat - deriving new data by addin…

    Redefine format - rename fields

    Conditional DML

    DML which is separated based on a condition

    SORT WIT…

    # This graph is using the input f…
    cd $AI_RUN
    ./my_graph1.ksh $INPUT_FILE_PARAMET…
    # This graph also is using the inpu…
    ./my_graph2.ksh $INPUT_FILE_PARAMET…
    exit

    else
    echo Insufficient parameters
    exit 1

    fi
    -------------------------------------
    #!/bin/ksh

    #Running the set up script on environ…
    typeset PROJ_DIR=$(cd $(dirname $…
    . $PROJ_DIR/ab_project_setup.ksh $PRO…

    #Exporting the script parameter1 to I…
    export INPUT_FILE_NAME=$1

    # This graph is using the input file
    cd $AI_RUN
    ./my_graph1.ksh

    # This graph also is using the input …
    ./my_graph2.ksh

    exit

    next_in_sequence() function in your transform… values" component. Or you can write a stored …

    Note: if you use partitions, then do something …

    (next_in_sequence()-1)*no_of_partition()+thi…
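The note on partitions can be checked with a short Python sketch. It assumes the truncated last term of the expression is the zero-based index of the current partition and that each partition's own counter starts at 1; `global_sequence` and its parameters are hypothetical names used only for this illustration.

```python
def global_sequence(local_seq, n_partitions, partition_index):
    """Interleave per-partition counters so no two partitions share a value.

    local_seq       -- counter inside one partition, starting at 1 (assumption)
    n_partitions    -- total number of partitions
    partition_index -- zero-based index of the current partition (assumption)
    """
    return (local_seq - 1) * n_partitions + partition_index


n_partitions = 3
ids = [
    global_sequence(seq, n_partitions, part)
    for part in range(n_partitions)
    for seq in range(1, 5)  # four records per partition
]
```

With 3 partitions, partition 0 produces 0, 3, 6, 9, partition 1 produces 1, 4, 7, 10, and partition 2 produces 2, 5, 8, 11, so the values never collide even though every partition counts independently.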

    .abinitiorc

    This is a config file for ab initio - in user's hom… $AB_…

    runsql

    Join with db

    compression components

    filter by expression

    sort (single or multiple keys)

    rollup trash

    partition by expression / partition by k…

    running on hosts

    co>operating system is layered on top of nativ… GDE, GDE generates a script (according to "r… execute the scripts on different machines (usin… connection methods, like rexec telnet rsh rlog… codes back.

    conventional loading vs direct loading

    This is basically an Oracle question - regardin… Conventional load - using insert statements. A… will be checked, all indexes will be updated.

    Direct load - data is written directly block by b… partition. Some constraints are checked, index… native options to skip index maintenance.

    semi-join

    in abinitio there are 3 types of joins: inner join…

    for inner join record_requiredN param…

    for outer join it is false for all the "in" …

    for semi join it is true for the required …

    components.
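The effect of the record_requiredN settings listed above can be sketched in Python. This is a minimal illustration, not Ab Initio code: the `join` function, its argument names, and the assumption of at most one record per key on each input are all hypothetical.

```python
def join(in0, in1, key, record_required0=True, record_required1=True):
    """Two-input join where each flag says whether that input must have a match."""
    idx0 = {rec[key]: rec for rec in in0}
    idx1 = {rec[key]: rec for rec in in1}
    out = []
    for k in sorted(set(idx0) | set(idx1)):
        rec0, rec1 = idx0.get(k), idx1.get(k)
        if record_required0 and rec0 is None:
            continue  # in0 is required but has no record for this key
        if record_required1 and rec1 is None:
            continue  # in1 is required but has no record for this key
        out.append((rec0, rec1))
    return out


in0 = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
in1 = [{"id": 2, "amount": 10}, {"id": 3, "amount": 20}]

inner = join(in0, in1, "id")                # required on both inputs
outer = join(in0, in1, "id", False, False)  # required on neither input
semi = join(in0, in1, "id", True, False)    # required on in0 only
```

With both flags true only key 2 survives; with both false keys 1, 2 and 3 all appear, with `None` standing in for the missing side; with only in0 required, keys 1 and 2 are kept.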

    http://www.geekinterview.com/Interview-Questions/Data-Warehouse/Abinitio/page10