8/11/2019 Abinitio Material
Question Answer ===========================
Phases vs Checkpoints
Phases - are used to break the graph into pieces [...]
after its completion. Phases are used to effectively [...] CPU, disk) parts of the application.
Checkpoints - created for recovery purposes. [...] can recover to the latest saved point - and rerun [...]
You can have phase breaks with or without checkpoints.
xfr
A new sandbox will have many directories: [...]
with extension .xfr containing your own custom [...]
"somepath/xfr/yourfile.xfr"). Usually XFR sto[...]
three types of parallelism
1) Data Parallelism - data ([...] copies of the same components.
2) Component Parallelism (execute simultaneously [...]
3) Pipeline (sequential).
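A minimal Python sketch of two of these ideas, not Ab Initio code. Data parallelism runs the same transform over separate partitions of the data; pipeline parallelism lets a downstream step take each record as soon as the upstream step produces it (the generators below only mimic that record-at-a-time hand-off). The names `transform`, `run_data_parallel` and `pipeline` are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def transform(record):
    """Stand-in for a component's per-record logic (hypothetical)."""
    return record * 10


def run_data_parallel(records, ways):
    """Data parallelism: split the data, run the same transform on each part."""
    partitions = [records[i::ways] for i in range(ways)]  # round-robin split
    with ThreadPoolExecutor(max_workers=ways) as pool:
        results = pool.map(lambda part: [transform(r) for r in part], partitions)
    return [r for part in results for r in part]  # gather the partitions


def pipeline(records):
    """Pipeline: each stage handles a record as soon as it arrives."""
    stage1 = (r + 1 for r in records)        # upstream stage
    stage2 = (transform(r) for r in stage1)  # downstream stage, record by record
    return list(stage2)
```

Note that the gathered output of `run_data_parallel` is grouped by partition, so record order is not preserved.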
MFS
Multi-File System
m_mkfs - create a multifile (m_mkfs ctrlfile m[...]
m_ls - list all the multifiles
m_rm - remove the multifile
m_cp - copy a multifile
m_mkdir - to add more directories to existing [...]
Memory requirements of a graph
Each partition of a component uses: [...]
Add size of lookup files used in phase [...]
once). Multiply by degree of parallelism. Add [...]
is used in that phase.
Select the largest-memory phase in the [...]
[...] SCAN WITH [...]
Scan followed by dedup sort and select the la[...]
dedup sort with null key
If we don't use any key in the sort component [...]
then the output depends on the keep parameter [...]
first - only the first record
last - only last record
unique_only - there will be no records
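The keep behaviour above can be sketched in Python. This is an illustration only, not the component itself: `dedup_sorted` is a hypothetical helper, and a missing key is modelled by treating all records as one group.

```python
from itertools import groupby


def dedup_sorted(records, key=None, keep="first"):
    """Keep one record per key group; key=None puts every record in one group."""
    key_func = key if key is not None else (lambda record: None)
    output = []
    for _, group in groupby(records, key=key_func):
        group = list(group)
        if keep == "first":
            output.append(group[0])
        elif keep == "last":
            output.append(group[-1])
        elif keep == "unique_only":
            if len(group) == 1:  # only groups without duplicates survive
                output.append(group[0])
        else:
            raise ValueError(f"unknown keep value: {keep}")
    return output
```

With no key and more than one input record, the single group always counts as containing duplicates, which is why `unique_only` yields nothing.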
join on partitioned flow
file1 (A,B,C) , file2 (A,B). We partition bot[...]
should we partition by "A,B" ? Not clear.
checkin, checkout
You can do checkin/checkout using the wizard [...]
how to have different passwords for QA and production
parameterize the .dbc file - or use environment [...]
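As a rough analogy in Python (not an Ab Initio .dbc file): the connection details are read from the environment, so the same code runs in QA and in production and only the environment differs. The variable names `MY_DB_NAME`, `MY_DB_USER` and `MY_DB_PASSWORD` are hypothetical.

```python
import os


def db_settings(environ=os.environ):
    """Read connection settings from the environment (names are hypothetical)."""
    return {
        "db_name": environ.get("MY_DB_NAME", ""),
        "user": environ.get("MY_DB_USER", ""),
        "password": environ.get("MY_DB_PASSWORD", ""),
    }
```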
the environment parameters required by the p[...]
Environment.
Aggregate vs Rollup
Aggregate - old component
Rollup - newer, extended, recommended to use [...]
(built-in functions like sum count avg min max [...]
EME, GDE, Co-operating system
EME = Enterprise Metadata Environment [...]
analysis, dependency analysis). It is on [...]
transformations, config info, source an[...]
you checkin/checkout. /Project dir of E[...]
sandboxes connected to it. It also help[...]
air commands to manipulate repository [...]
GDE = Graphical Development Environment [...]
Co-operating system = Ab Initio server [...]
fencing
fencing means job controlling on priority basis [...]
In AI it actually refers to customized phase br[...]
source data volume process will not cough in [...]
processes.
Fencing - changing a priority of a job
Phasing - managing the resources to avoid deadlock [...]
For example, limiting the number of simultan[...]
(by breaking the graph into phases, only 1 of w[...]
Continuous components
Continuous components - produce useful output [...]
Continuous rollup, Continuous update batch s[...]
deadlock
Deadlock is when two or more proce[...]
resource. To avoid use phasing and r[...]
environment
AB_[...]
wrapper script
unix script to run graphs
multistage component
A multistage component is a compon[...]
in 5 stages (1. input select, 2. tempora[...]
output selection, 5. finalize). So it is [...]
packages. Examples: scan, Normaliz[...]
normalize and denormalize sorted.
Dynamic DML
Dynamic DML is used if the input m[...]
different time different input files ar[...]
different dml. In that case we can use [...]
read in the input file received and ac[...]
dml is used.
fan in, fan out
fan out - partition component
fan in - departition component
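A small Python sketch of the fan out / fan in idea, assuming a partition-by-key style split: every record with the same key value lands in the same partition, and a gather step merges the partitions back into one flow. `partition_by_key` and `gather` are illustrative names, and the CRC32 hash is just one deterministic choice.

```python
import zlib


def partition_by_key(records, key, ways):
    """Fan out: route each record to a partition chosen from its key."""
    partitions = [[] for _ in range(ways)]
    for record in records:
        index = zlib.crc32(str(record[key]).encode("utf-8")) % ways
        partitions[index].append(record)
    return partitions


def gather(partitions):
    """Fan in: collect all partitions back into a single flow."""
    return [record for part in partitions for record in part]
```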
lock
a user can lock the graph for editing [...]
and can not edit the same graph.
join vs lookup
Lookup is good for speed for small f[...]
memory). For large files use join. Yo[...]
limit to handle big joins.
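The trade-off can be sketched in Python. The lookup version keeps the whole small input in memory as a dictionary; the join version walks two inputs that are already sorted on the key and holds only the current rows. Both helpers are hypothetical and assume the key is unique in the smaller (right-hand) input.

```python
def join_with_lookup(big, small, key):
    """Lookup style: hold the small input in memory, stream the big one."""
    table = {row[key]: row for row in small}  # assumes unique keys in `small`
    return [{**row, **table[row[key]]} for row in big if row[key] in table]


def merge_join(left, right, key):
    """Join style: walk two inputs already sorted on the key (inner join)."""
    result, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i][key] < right[j][key]:
            i += 1
        elif left[i][key] > right[j][key]:
            j += 1
        else:
            result.append({**left[i], **right[j]})
            i += 1  # assumes unique keys in `right`; stay on the same right row
    return result
```

Memory use of the lookup grows with the size of the small input, while the merge join needs sorted inputs instead.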
multi update
multi update executes SQL statemen[...]
completely separate piece of work.
scheduler
We can use Autosys, Contro[...]
We can take care of depende[...]
scripts should run sequentia[...]
Autosys, or we can create a w[...]
sequential commands (nohup [...]
command.ksh &; etc). We c[...]
Initio to execute individual s[...]
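A minimal sketch of the "run the scripts one after another" idea, written in Python rather than ksh. It stops at the first failing step so that later, dependent steps do not run. The two sample commands are stand-ins that only set an exit code; real steps would be the deployed graph scripts.

```python
import subprocess
import sys


def run_in_sequence(commands):
    """Run each command in order; stop at the first non-zero exit code."""
    for command in commands:
        code = subprocess.run(command).returncode
        if code != 0:
            return code  # later commands are skipped
    return 0


# Hypothetical stand-ins for graph scripts: one succeeds, one fails with code 3.
STEP_OK = [sys.executable, "-c", "raise SystemExit(0)"]
STEP_FAIL = [sys.executable, "-c", "raise SystemExit(3)"]
```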
Api and Utility modes in input table
These are database interfaces (api - u[...]
whatever vendor provides)
lookup file
lookup file component. Func[...]
lookup_next, lookup_match, [...]
Lookups are always used wit[...]
components.
Calling stored proc in DB
You can call stored proc (for ex[...]
you can even write SP in Ab Initio. [...]
good performance.
Frequently used functions
string_ltrim, string_rtrim, string_sub[...]
now()
data validation
is_valid, is_null, is_blank, is_defined
driving port
When joining inputs (in0, in1, ...) on[...]
default - in0). Driving input is usually [...]
smallest can have "Sorted-Input" par[...]
sorted" because it will be loaded com[...]
Ab Initio vs Informatica for ETL
Ab Initio benefits: parallelism built i[...]
amounts of data, easy to build and run [...]
easily modified as needed (if somethi[...]
itself). The scripts can be easily sche[...]
and easily integrated with other syste[...]
Ab Initio doesn't require a dedicated [...]
Ab Initio doesn't have built-in CDC [...]
Capture).
Ab Initio allows to (attach error / rej[...]
capture and analyze the message and [...]
Informatica which has just one huge [...]
metrics for each component.
override key
override key option is used when we [...]
different field names.
control file
control file should be in the multifile [...]
the serial files)
max-core
max-core parameter (for example, so[...]
of memory used by a component (lik[...]
before spilling to disk. Usually you d[...]
default value. Setting it too high may [...]
of OS swapping and degrading of th[...]
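To make "spilling to disk" concrete, here is a small Python sketch of a sort with a memory budget. It is not how the component is implemented; it only shows the general technique: sort what fits in memory, write each sorted run to a temporary file, then merge the runs. `sort_with_spill` and `max_in_memory` are illustrative names.

```python
import heapq
import tempfile


def sort_with_spill(numbers, max_in_memory):
    """Sort integers while buffering at most `max_in_memory` of them at once."""
    runs, buffer = [], []

    def spill():
        run = tempfile.TemporaryFile(mode="w+")
        run.writelines(f"{n}\n" for n in sorted(buffer))
        run.seek(0)
        runs.append(run)
        buffer.clear()

    for n in numbers:
        buffer.append(n)
        if len(buffer) >= max_in_memory:
            spill()  # memory budget reached: write a sorted run to disk
    if not runs:
        return sorted(buffer)  # everything fit in memory, no disk I/O
    if buffer:
        spill()
    merged = list(heapq.merge(*((int(line) for line in run) for run in runs)))
    for run in runs:
        run.close()
    return merged
```

A larger budget means fewer runs and less disk I/O, which is the benefit of raising the limit. The sketch returns the merged result as a list for convenience; a real implementation would stream it.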
Input Parameters
graph > select parameters tab > click [...]
Usage: $paramname. Edit > paramet[...]
substituted during run time. You may [...]
scope as formal.
Error Trapping
Each component has reject, error, an[...]
records, Error captures correspondin[...]
execution statistics of the componen[...]
each component by setting reject thr[...]
on first reject, or setting ramp/limit.
function in transform function.
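A sketch of how a ramp/limit setting is commonly described, stated here as an assumption rather than taken from the text above: the component tolerates up to limit + ramp * (records processed so far) rejects and aborts once that is exceeded. With both values at zero this reduces to aborting on the first reject. `should_abort` is a hypothetical helper.

```python
def should_abort(rejects, processed, limit=0, ramp=0.0):
    """True once rejects exceed the tolerance limit + ramp * records processed.

    The formula is an assumption used for illustration.
    """
    return rejects > limit + ramp * processed
```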
tuning performance
Go parallel using partitioning. Round [...]
balance. Use Multi-file system (MFS).
Use Ad [...]
Minimize the use of regular expression [...]
transfer functions [...]
Avoid repartitioning of data unnecessa[...]
more than two flows, use Reformat rat[...]
For joining records from [...] flows use C[...]
there is a need to follow some specific [...]
is required then it is preferable to use G[...]
Instead of putting many Reformat com[...]
indexes parameter in the first Reforma[...]
condition there.
delta table
Delta table maintain the sequencer of [...]
Master (or base) table - a table on top o[...]
scan vs rollup
rollup - performs aggregate calculations on gr[...]
totals
packages
used in multistage components or transform c[...]
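A Python sketch of the usual distinction, offered as an illustration: a rollup emits one summary record per key group, while a scan emits one record per input record carrying the running (cumulative) value within its group. The helper names and the `total` / `running_total` output fields are made up.

```python
from itertools import accumulate, groupby


def rollup(records, key, field):
    """One output record per key group, carrying the group total."""
    ordered = sorted(records, key=lambda r: r[key])
    return [
        {key: k, "total": sum(r[field] for r in group)}
        for k, group in groupby(ordered, key=lambda r: r[key])
    ]


def scan(records, key, field):
    """One output record per input record, carrying the running total so far."""
    ordered = sorted(records, key=lambda r: r[key])
    output = []
    for _, group in groupby(ordered, key=lambda r: r[key]):
        group = list(group)
        running = accumulate(r[field] for r in group)
        output.extend({**r, "running_total": t} for r, t in zip(group, running))
    return output
```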
Reformat vs "Redefine Format"
Reformat - deriving new data by addin[...]
Redefine format - rename fields
Conditional DML
DML which is separated based on a condition
SORT WITHIN [...]
# This graph is using the input f[...]
cd $AI_RUN
./my_graph1.ksh $INPUT_FILE_PARAMET[...]
# This graph also is using the inpu[...]
./my_graph2.ksh $INPUT_FILE_PARAMET[...]
exit [...]
else
echo Insufficient parameters
exit 1
fi
----------------------------------------
#!/bin/ksh
#Running the set up script on environ[...]
typeset PROJ_DIR=$(cd $(dirname $0)/[...]
. $PROJ_DIR/ab_project_setup.ksh $PRO[...]
#Exporting the script parameter 1 to I[...]
export INPUT_FILE_NAME=$1
# This graph is using the input file
cd $AI_RUN
./my_graph1.ksh
# This graph also is using the input [...]
./my_graph2.ksh
exit
next_in_sequence() function in your transform [...]
values" component. Or you can write a stored [...]
Note: if you use partitions, then do something [...]
(next_in_sequence()-1)*no_of_partition()+thi[...]
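A quick Python check of why this kind of expression gives unique numbers across partitions. It assumes the truncated last term is the zero-based index of the current partition, and that each partition counts its own records from 1. `global_sequence` and `all_ids` are illustrative names.

```python
def global_sequence(local_sequence, partitions, this_partition):
    """Spread per-partition counters (1, 2, 3, ...) over non-overlapping numbers."""
    return (local_sequence - 1) * partitions + this_partition


def all_ids(partitions, records_per_partition):
    """Ids produced when every partition numbers its own records from 1."""
    return [
        global_sequence(seq, partitions, part)
        for part in range(partitions)
        for seq in range(1, records_per_partition + 1)
    ]
```

Each partition only ever produces numbers with its own remainder modulo the partition count, so two partitions can never collide.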
.abinitiorc
This is a config file for ab initio - in user's hom[...]
$AB_[...]
runsql
join with db
compression components
filter by expression
sort (single or multiple keys)
rollup
trash
partition by expression / partition by k[...]
running on hosts
co>operating system is layered on top of native [...]
GDE, GDE generates a script (according to "r[...]
execute the scripts on different machines (using [...]
connection methods, like rexec telnet rsh rlog[...]
codes back.
conventional loading vs direct loading
This is basically an Oracle question - regarding [...]
Conventional load - using insert statements. A[...]
will be checked, all indexes will be updated.
Direct load - data is written directly block by block [...]
partition. Some constraints are checked, index[...]
native options to skip index maintenance.
semi-join
in abinitio there are 3 types of joins: inner join [...]
for inner join record_requiredN param[...]
for outer join it is false for all the "in" [...]
for semi join it is true for the required [...]
components.
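The three cases can be sketched in Python with one flag per input, mimicking the record_requiredN idea for two inputs. This is an illustration under simplifying assumptions (two inputs, unique keys per input), and `join` with its `required0` / `required1` arguments is a hypothetical helper.

```python
def join(in0, in1, key, required0, required1):
    """Join two inputs on `key`; the flags mimic record_required0/1.

    A flag set to True means: emit output only when that input has a match.
    Assumes keys are unique within each input.
    """
    left = {row[key]: row for row in in0}
    right = {row[key]: row for row in in1}
    output = []
    for k in sorted(set(left) | set(right)):
        if (required0 and k not in left) or (required1 and k not in right):
            continue
        output.append({**left.get(k, {}), **right.get(k, {})})
    return output
```

Both flags true gives the inner join, both false the outer join, and one true with the other false keeps every record of the required input whether or not the other input matches.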
http://www.geekinterview.com/Interview-Questions/Data-Warehouse/Abinitio/page10