
Probability Fundas


PROBABILITY FUNDAMENTALS

PROF. NAVEEN BHATIA

PROBABILITY IS A SCIENTIFIC APPROACH TO UNCERTAINTY.

IT IS THE STUDY OF MATHEMATICAL TECHNIQUES FOR MAKING QUANTITATIVE INFERENCES ABOUT UNCERTAINTY.

THE COMMON VALUE TO WHICH THE RELATIVE FREQUENCY CONVERGES IS CALLED THE PROBABILITY OF THE DESIRED OUTCOME. FOR EXAMPLE, THE RELATIVE FREQUENCY OF HEADS APPROACHES A COMMON VALUE OF 1/2 AS THE NUMBER OF COIN TOSSES INCREASES.
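The convergence of the relative frequency can be illustrated with a small simulation. This is only a sketch: the function name `relative_frequency_of_heads`, the fixed seed and the chosen numbers of tosses are assumptions made for illustration, not part of the slides.

```python
import random


def relative_frequency_of_heads(num_tosses, seed=0):
    """Simulate fair coin tosses and return the fraction that come up heads."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    heads = sum(1 for _ in range(num_tosses) if rng.random() < 0.5)
    return heads / num_tosses


# The printed fractions should drift towards 0.5 as the number of tosses grows.
for n in (10, 100, 10_000, 100_000):
    print(n, relative_frequency_of_heads(n))
```

With few tosses the fraction can be far from 1/2; it settles down only in the long run, which is the frequency view of probability described above.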

PROBABILITY CAN BE APPLIED TO ENGINEERING AND SCIENCE AREAS SUCH AS RELIABILITY, QUALITY CONTROL AND THE ANALYSIS OF QUEUES.

RELIABILITY: BRANCH OF ENGINEERING CONCERNED WITH THE MAINTENANCE OF QUALITY IN THE MANUFACTURING PROCESS.

QUALITY CONTROL: THE GOAL HERE IS TO MINIMISE THE NUMBER OF DEFECTS PRODUCED IN THE MANUFACTURING PROCESS. SINCE IT IS NOT POSSIBLE TO TEST EVERY ITEM, PROBABILITY CAN BE USED TO ANALYSE THE UNCERTAINTY.

QUEUING: BRANCH OF ENGINEERING CONCERNED WITH THE ANALYSIS AND DESIGN OF SYSTEMS INVOLVING MULTIPLE SERVERS AND MULTIPLE CLIENTS, IN WHICH CLIENTS MAY BE REQUIRED TO WAIT FOR SERVICE.

SIMILARLY, WE CAN APPLY THE THEORY OF PROBABILITY IN FINANCIAL ENGINEERING. THE FUNDAMENTAL PRINCIPLE UNDERLYING FINANCIAL ENGINEERING IS THE PRINCIPLE OF NO ARBITRAGE.

THE PRINCIPLE ASSERTS THAT TWO SECURITIES THAT PROVIDE THE SAME FUTURE CASH FLOWS AND HAVE THE SAME LEVEL OF RISK MUST SELL FOR THE SAME PRICE.

NO ARBITRAGE IMPLIES THAT THERE IS NO FREE LUNCH. SIMILARLY, IN PORTFOLIO OPTIMISATION (SEEKING THE HIGHEST RISK-ADJUSTED RETURN), FINANCIAL ENGINEERS USE PROBABILITY AND STATISTICS TO MEASURE EXPECTED RETURN AND RISK FOR AN INDIVIDUAL SECURITY AND FOR A PORTFOLIO.

PROBABILITIES ARE LONG-RUN RELATIVE FREQUENCIES. THIS PERSPECTIVE, IN WHICH PROBABILITY IS CONSIDERED TO BE A CONSTANT LONG-RUN RELATIVE FREQUENCY, IS KNOWN AS THE OBJECTIVIST INTERPRETATION OF PROBABILITY.

THE OTHER PERSPECTIVE IS KNOWN AS THE BAYESIAN OR SUBJECTIVIST INTERPRETATION, IN WHICH PROBABILITIES ARE CONSIDERED TO BE MEASURES OF PERSONAL BELIEF.

RANDOM VARIABLE: CONSIDER THE PAY-OFF IN A GAME OF ROLLING A DIE. THE PAY-OFF IS AN EXAMPLE OF A RANDOM VARIABLE BECAUSE ITS VALUE VARIES IN A WELL-DEFINED WAY ACCORDING TO THE OUTCOME OF ROLLING THE DIE.

IT IS RANDOM BECAUSE THE UNDERLYING PROCESS ON WHICH IT DEPENDS IS ITSELF RANDOM.

FOR INSTANCE, THE GAME IS ROLLING A DIE. THE PAY-OFF IS $1 IF THE RESULT IS 2, 3 OR 4; THE PAY-OFF IS $2 IF IT IS 5; IF IT IS 1 OR 6 WE LOSE $3.

THE PAY-OFF X IS A DISCRETE RANDOM VARIABLE BECAUSE ITS POSSIBLE VALUES BELONG TO THE DISCRETE SET {-3, 1, 2}.

GENERALLY SPEAKING, A RANDOM VARIABLE IS ANY QUANTITY WITH REAL VALUES THAT DEPENDS IN A WELL-DEFINED WAY ON SOME PROCESS WHOSE OUTCOMES ARE UNCERTAIN.

THE SET OF ALL POSSIBLE OUTCOMES OF THE EXPERIMENT IS KNOWN AS THE "SAMPLE SPACE", HERE {1, 2, 3, 4, 5, 6}, AND THE PAY-OFF FUNCTION DEFINES THE RANDOM VARIABLE: X(1) = -3, X(2) = 1, X(3) = 1, X(4) = 1, X(5) = 2 AND X(6) = -3.

NOTE THAT FOR DIFFERENT TYPES OF BETS THERE COULD BE MANY DIFFERENT REAL-VALUED FUNCTIONS DEFINED ON A GIVEN SAMPLE SPACE.

DISTRIBUTION OF PROBABILITY: IF WE ASSUME THAT THE DIE IS BALANCED, THEN EACH OUTCOME OF THE EXPERIMENT HAS PROBABILITY 1/6. THIS IS THE DISTRIBUTION OF PROBABILITY OVER THE SAMPLE SPACE.

WHAT WE NEED TO KNOW IS THE DISTRIBUTION OF PROBABILITY FOR THE PAY-OFF:

P(X = -3) = 1/3, P(X = 1) = 1/2 AND P(X = 2) = 1/6.

THIS PROBABILITY FUNCTION SUPPRESSES INFORMATION ON THE UNDERLYING EXPERIMENT, BUT WHAT WE ACTUALLY CARE ABOUT IS THE PAY-OFF.

TWO PROPERTIES OF A PROBABILITY DISTRIBUTION:
1. 0 ≤ p(x) ≤ 1 FOR ALL x.
2. ∑ p(x) = 1, SUMMING OVER ALL x.

THE FIRST PROPERTY INDICATES THAT EVERY VALUE OF THE FUNCTION p(x) IS A RELATIVE FREQUENCY. THE SECOND PROPERTY INDICATES THAT ONE OF THE POSSIBLE PAY-OFFS WILL OCCUR.

EXPECTED PAY-OFF: E[X] = ∑ x·p(x), SUMMING OVER ALL x. IN OUR EXAMPLE IT IS (1/3)(-3) + (1/2)(1) + (1/6)(2) = -1/6.

THE EXPECTED PAY-OFF REPRESENTS THE AVERAGE PAY-OFF PER GAME IF WE PLAY A LARGE NUMBER OF GAMES. E[X] IS THE ARITHMETIC AVERAGE OF THE VALUES OF X WEIGHTED BY THE PROBABILITY MASS FUNCTION p(x).

TWO RANDOM VARIABLES Y1 AND Y2 ARE IDENTICALLY DISTRIBUTED IF p_Y1(y) = p_Y2(y) FOR ALL y.
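The pay-off distribution and the expected pay-off of the die game can be rebuilt mechanically from the sample space. The sketch below assumes a balanced die; the names `payoff`, `pmf` and `expected_payoff` are illustrative only.

```python
from collections import defaultdict
from fractions import Fraction


def payoff(face):
    """Pay-off rule of the die game: +1 for 2/3/4, +2 for 5, -3 for 1 or 6."""
    if face in (2, 3, 4):
        return 1
    if face == 5:
        return 2
    return -3


# Push the uniform distribution on {1..6} through the pay-off function.
pmf = defaultdict(Fraction)  # Fraction() is 0, so missing keys start at zero
for face in range(1, 7):
    pmf[payoff(face)] += Fraction(1, 6)

# Expected pay-off: sum of x * p(x) over the possible pay-offs.
expected_payoff = sum(x * p for x, p in pmf.items())

print(dict(pmf))
print(expected_payoff)
```

Using `Fraction` keeps the arithmetic exact, so the result can be compared directly with the hand calculation.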

CHOOSING PAY-OFFS WHEN DISTRIBUTIONS ARE NOT IDENTICAL: IF THE EXPECTED PAY-OFFS ARE THE SAME, ONE HAS TO LOOK AT THE RANGE. RISK-AVERSE INVESTORS WILL SELECT THE ONE WITH THE SMALLER RANGE OF PAY-OFFS WHEN THE EXPECTED PAY-OFFS ARE THE SAME.

CLASSICAL PROBABILITY: TWO ASSUMPTIONS:
1. THE NUMBER OF POSSIBLE OUTCOMES OF THE RANDOM EXPERIMENT IS FINITE.
2. ALL OUTCOMES ARE EQUALLY LIKELY, I.E. EACH OUTCOME HAS THE SAME PROBABILITY.

P(E) = |E|/|S|, WHERE |E| IS THE NUMBER OF ELEMENTS IN THE EVENT AND |S| IS THE NUMBER OF ELEMENTS IN THE SAMPLE SPACE.

SUPPOSE A DIE IS ROLLED. WHAT IS THE PROBABILITY THAT THE OUTCOME IS A MULTIPLE OF 3? THE EVENT IS {3, 6}; THEREFORE THE PROBABILITY IS 2/6 = 1/3.

ANOTHER EXAMPLE: IF TWO COINS ARE TOSSED, WHAT IS THE PROBABILITY THAT BOTH COINS SHOW THE SAME FACE? HERE |E| = 2 AND |S| = 4, AND HENCE P = 1/2.

ONE MUST CHOOSE THE SAMPLE SPACE IN SUCH A WAY THAT ALL THE RELEVANT INFORMATION ABOUT THE EXPERIMENT IS CAPTURED.

CLASSICAL PROBABILITY: A RECENT STUDY ON TEENAGE DRUG AND ALCOHOL USE FOUND THAT ONE IN TEN TEENAGERS USES MARIJUANA AT LEAST ONCE A MONTH, ONE IN FIVE ADMITTED TO DRINKING ALCOHOL ONCE A WEEK, AND ONE IN FOUR ADMITTED TO SMOKING CIGARETTES ON A DAILY BASIS.

THE RESEARCH ALSO FOUND THAT 10% OF THE RESPONDENTS ADMITTED TO SMOKING CIGARETTES DAILY AND DRINKING ALCOHOL ONCE A WEEK; 5% TO SMOKING CIGARETTES AND USING MARIJUANA; 3% TO DRINKING ALCOHOL AND USING MARIJUANA; AND 1% TO ALL THREE.

IN THIS EXAMPLE ONE DOES NOT KNOW THE SIZE OF THE SAMPLE, BUT ONE CAN STILL DETERMINE THE PROBABILITIES. 0 ≤ P(E) ≤ 1 FOR ALL EVENTS E, AND P(S) = 1. WHEN THE NUMBER OF EVENTS IS SMALL ONE CAN USE VENN DIAGRAMS.

THREE EVENTS: M: THE TEENAGER USES MARIJUANA; A: THE TEENAGER CONSUMES ALCOHOL ONCE A WEEK; C: THE TEENAGER ADMITS TO SMOKING CIGARETTES ON A DAILY BASIS.

P(M) = 10%; P(A) = 20%; P(C) = 25%; P(C∩A) = 10%; P(C∩M) = 5%; P(A∩M) = 3%; AND P(M∩A∩C) = 1%.

OTHER PROPERTIES:

P(A^c) = 1 - P(A): THIS FOLLOWS FROM THE FACT THAT A AND A^c ARE MUTUALLY EXCLUSIVE AND P(A) + P(A^c) = 1.

P(A ∪ B) = P(A) + P(B) - P(A∩B)

P(A ∪ B ∪ C) = P(A) + P(B) + P(C) - P(A∩B) - P(B∩C) - P(A∩C) + P(A∩B∩C)

USING THIS LOGIC, IF WE NEED TO FIND THE PROBABILITY OF A TEENAGER INDULGING IN AT LEAST ONE OF THE THREE ACTIVITIES, IT IS 0.10 + 0.20 + 0.25 - (0.10 + 0.05 + 0.03) + 0.01 = 0.38. THEREFORE 1 - 0.38 = 0.62 IS THE PROBABILITY THAT A TEENAGER DOES NOT ENGAGE IN ANY OF THE THREE ACTIVITIES.
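The three-event addition rule can be checked against the teenager survey figures. This is a minimal sketch; the helper names `union_of_three` and `pct` are made up for the illustration.

```python
from fractions import Fraction


def union_of_three(p_a, p_b, p_c, p_ab, p_bc, p_ac, p_abc):
    """Inclusion-exclusion for P(A or B or C)."""
    return p_a + p_b + p_c - p_ab - p_bc - p_ac + p_abc


def pct(k):
    """Exact percentage as a fraction."""
    return Fraction(k, 100)


# Here A = marijuana, B = alcohol, C = cigarettes (survey figures from the text).
p_at_least_one = union_of_three(
    pct(10), pct(20), pct(25),
    p_ab=pct(3),    # alcohol and marijuana
    p_bc=pct(10),   # cigarettes and alcohol
    p_ac=pct(5),    # cigarettes and marijuana
    p_abc=pct(1),   # all three
)
p_none = 1 - p_at_least_one

print(float(p_at_least_one), float(p_none))
```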

CONDITIONAL PROBABILITY: IF WE ROLL A DIE, WHAT IS THE PROBABILITY THAT 2 SHOWS UP (EVENT A) GIVEN THAT THE OUTCOME IS AN EVEN NUMBER (EVENT B)?

P(A/B) = P(A∩B)/P(B) = [1/6]/[1/2] = 1/3

NOW CONSIDER TWO DICE BEING TOSSED. THERE ARE 36 POSSIBLE OUTCOMES. LET d1 BE THE VALUE OF THE UP-FACE OF DIE 1 AND d2 THE VALUE OF THE UP-FACE OF DIE 2. WE DEFINE TWO EVENTS:

A = {(d1, d2): d1 + d2 = 4}
B = {(d1, d2): d2 ≥ d1}

P(A) = 3/36, SINCE A = {(1,3), (3,1), (2,2)}
P(B) = 21/36
P(B/A) = 2/3
P(A/B) = 2/21
P(A∩B) = 2/36 = P(B)·P(A/B) = [21/36]·[2/21] = 2/36

LET US LOOK AT ANOTHER EXAMPLE: A COED COLLEGE HAS THREE COURSES: SCIENCE, MANAGEMENT AND ENGINEERING. BY SEX, THE ENROLLMENT IS AS FOLLOWS:

          SCIENCE   MANAGEMENT   ENGG   TOTAL
MALE      250       350          200    800
FEMALE    100       50           50     200
TOTAL     350       400          250    1000

LET S1: THE STUDENT IS MALE; S2: THE STUDENT IS FEMALE; C1: THE STUDENT IS FROM SCIENCE; C2: THE STUDENT IS FROM MANAGEMENT; AND C3: THE STUDENT IS FROM ENGINEERING.

          C1    C2    C3    TOTAL
S1        .25   .35   .20   .80
S2        .10   .05   .05   .20
TOTAL     .35   .40   .25   1.00

P(S1) = 0.80 AND P(S2) = 0.20 ARE ALSO KNOWN AS MARGINAL PROBABILITIES.

P(C3/S2) = .05/.20 = 1/4
P(C1/S2) = .10/.20 = 1/2
P(C2/S2) = .05/.20 = 1/4
P(C1∪C2∪C3/S2) = P(C1/S2) + P(C2/S2) + P(C3/S2) = 1

IF TWO EVENTS ARE INDEPENDENT THEN P(A∩B) = P(A)·P(B), AND THEREFORE P(A/B) = P(A) AND P(B/A) = P(B) (FOR EXAMPLE, TOSSING A COIN TWICE). LET US LOOK AT AN EXAMPLE ON THE NEXT SLIDE.

SYSTEM RELIABILITY GIVEN SUBSYSTEM RELIABILITY

[FIGURE: A SYSTEM COMPOSED OF SUBSYSTEM 1 AND SUBSYSTEM 2]

IN THE PREVIOUS EXAMPLE, THE SYSTEM RELIABILITY IS RS = R1·R2. IF R1 = 0.9 AND R2 = 0.8 THEN RS = 0.9 × 0.8 = 0.72. WE CAN EXTEND THE SAME LOGIC TO N SUBSYSTEMS: RS = R1·R2·R3···RN, PROVIDED THE SUBSYSTEMS ARE MUTUALLY INDEPENDENT.

BAYES' THEOREM: IF B1, B2, …, BK REPRESENT A PARTITION OF S AND A IS AN ARBITRARY EVENT OF S, THEN THE TOTAL PROBABILITY OF A IS GIVEN BY

P(A) = P(B1)·P(A/B1) + P(B2)·P(A/B2) + … + P(BK)·P(A/BK)

WE WILL SEE A NUMERICAL EXAMPLE OF THIS VERY SHORTLY. FOR ANY Bj IN THE PARTITION:

P(Bj/A) = P(Bj∩A)/P(A) = P(Bj)·P(A/Bj) / [P(B1)·P(A/B1) + P(B2)·P(A/B2) + … + P(BK)·P(A/BK)]

THREE FIRMS SUPPLY NPN TRANSISTORS TO A MANUFACTURER OF TELEMETRY EQUIPMENT. ALL ARE MADE TO THE SAME SPECIFICATIONS.

FIRM   FRACTION DEFECTIVE   FRACTION SUPPLIED BY
1      .02                  .15
2      .01                  .80
3      .03                  .05

LET A DENOTE THE EVENT THAT AN ITEM IS DEFECTIVE, AND LET B1, B2, B3 DENOTE THE EVENTS THAT THE ITEM CAME FROM FIRM 1, 2 OR 3. THEN

P(B3/A) = P(B3)·P(A/B3) / [P(B1)·P(A/B1) + P(B2)·P(A/B2) + P(B3)·P(A/B3)] = (.05)(.03) / [(.15)(.02) + (.80)(.01) + (.05)(.03)] = .0015/.0125 = 3/25
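The transistor calculation is a direct application of the total probability rule followed by Bayes' theorem, and it can be written as a few lines of code. The function name `posterior` and the dictionary layout are assumptions chosen for this sketch.

```python
from fractions import Fraction


def posterior(priors, likelihoods):
    """Return (posterior probabilities, total probability of the evidence).

    priors[k]      = P(Bk), the probability the item came from source k
    likelihoods[k] = P(A/Bk), the probability of the evidence given source k
    """
    joint = {k: priors[k] * likelihoods[k] for k in priors}
    total = sum(joint.values())  # total probability rule
    return {k: v / total for k, v in joint.items()}, total


fraction_supplied = {1: Fraction(15, 100), 2: Fraction(80, 100), 3: Fraction(5, 100)}
fraction_defective = {1: Fraction(2, 100), 2: Fraction(1, 100), 3: Fraction(3, 100)}

post, p_defective = posterior(fraction_supplied, fraction_defective)
print(p_defective)  # overall fraction of defective items
print(post[3])      # probability a defective item came from firm 3
```

The same function also gives the posteriors for firms 1 and 2, and the three values sum to one.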

ANOTHER EXAMPLE OF BAYES' THEOREM:

URN   RED BALLS   BLACK BALLS   TOTAL
1     3           2             5
2     2           3             5

WE TOSS AN UNBIASED COIN TO CHOOSE THE URN FROM WHICH A BALL IS DRAWN, BUT WE DO NOT KNOW WHICH URN IS WHICH. SUPPOSE THE FIRST BALL DRAWN IS BLACK AND IS PUT BACK. WHAT IS THE PROBABILITY THAT THE SECOND BALL DRAWN FROM THE SAME URN IS ALSO BLACK?

P(U1) = 1/2 (SELECTING URN 1 BY THE TOSS) AND P(U2) = 1/2 (SELECTING URN 2). THE EVENT B1 DENOTES THAT THE FIRST BALL DRAWN IS BLACK AND B2 THAT THE SECOND BALL DRAWN IS ALSO BLACK.

P(U1/B1) = P(U1∩B1)/P(B1) = P(B1/U1)·P(U1) / [P(U1)·P(B1/U1) + P(U2)·P(B1/U2)] = (1/2·2/5) / [(1/2·2/5) + (1/2·3/5)] = 2/5, AND P(U2/B1) = 3/5.

WHEN THE FIRST BALL IS PUT BACK, THE SECOND BALL COMES FROM U1 WITH PROBABILITY 2/5 AND FROM U2 WITH PROBABILITY 3/5, SO P(B2/B1) = (2/5)(2/5) + (3/5)(3/5) = 13/25.

ANOTHER EXAMPLE OF BAYES' THEOREM: A FAMILY DOG IS MISSING. THREE HYPOTHESES ARE SUGGESTED: A. IT HAS GONE HOME. B. IT IS STILL WORRYING THAT BIG BONE IN THE PICNIC AREA. C. IT HAS WANDERED INTO THE WOODS.

A PRIORI PROBABILITIES, WHICH HAVE BEEN OBSERVED FROM THE PAST HABITS OF THE DOG, SUGGEST THAT P(A) = 1/4, P(B) = 1/2 AND P(C) = 1/4.

A CHILD IS SENT TO SEARCH FOR THE DOG. IF IT IS IN THE PICNIC AREA THE PROBABILITY OF FINDING THE DOG IS 90%, AND IF IT IS IN THE WOODS THE PROBABILITY IS 50%. WHAT IS THE PROBABILITY THAT THE DOG WILL BE FOUND IN THE PARK?

LET D DENOTE THE EVENT THAT THE DOG IS FOUND IN THE PARK. P(D/A) = 0, P(D/B) = 90%, P(D/C) = 50%.

P(D) = P(A)·P(D/A) + P(B)·P(D/B) + P(C)·P(D/C) = (1/4)(0) + (1/2)(0.9) + (1/4)(0.5) = 115/200

WHAT IS THE PROBABILITY THAT THE DOG WILL BE FOUND AT HOME? LET D' DENOTE THIS EVENT. THEN

P(D') = P(A)·P(D'/A) + P(B)·P(D'/B) + P(C)·P(D'/C) = (1/4)(1) + (1/2)(0) + (1/4)(0) = 1/4

WHAT IS THE PROBABILITY THAT THE DOG IS LOST? 1 - P(D) - P(D') = 1 - 115/200 - 1/4 = 7/40.

A BALL IS DRAWN FROM AN URN CONTAINING b BLACK AND r RED BALLS AND IS DISCARDED. WITHOUT KNOWING ITS COLOUR, WHAT IS THE PROBABILITY THAT THE SECOND BALL DRAWN IS BLACK?

P(B1) = b/(b+r) AND P(B1^c) = r/(b+r)
P(B2/B1) = (b-1)/(b+r-1) AND P(B2/B1^c) = b/(b+r-1)
P(B2) (I.E. WITHOUT KNOWING THE COLOUR OF THE FIRST BALL) = P(B1)·P(B2/B1) + P(B1^c)·P(B2/B1^c) = [b/(b+r)]·[(b-1)/(b+r-1)] + [r/(b+r)]·[b/(b+r-1)] = b(b+r-1)/[(b+r)(b+r-1)] = b/(b+r)

PERMUTATION & COMBINATION:

FUNDAMENTAL RULE: IF THERE ARE MULTIPLE CHOICES TO BE MADE, I.E. THERE ARE m1 POSSIBILITIES FOR THE FIRST CHOICE, m2 FOR THE SECOND CHOICE AND SO ON, AND IF THESE CAN BE COMBINED FREELY, THEN THE TOTAL NUMBER OF POSSIBLE CHOICES = m1·m2·m3···mn.

IN HOW MANY WAYS CAN 6 DICE APPEAR? THE ANSWER IS 6^6 = 46656. IN HOW MANY WAYS IF ALL HAVE TO SHOW DIFFERENT FACES? THE ANSWER IS 6! = 6·5·4·3·2·1 = 720: AFTER THE FIRST DIE SHOWS A FACE, THE SECOND MUST SHOW A DIFFERENT FACE, AND SO ON.

AN URN CONTAINS m DISTINGUISHABLE BALLS MARKED 1 TO m, FROM WHICH n BALLS WILL BE DRAWN UNDER VARIOUS SPECIFIED CONDITIONS.

1. SAMPLING WITH REPLACEMENT AND WITH ORDERING: WE DRAW n BALLS SEQUENTIALLY, EACH BALL DRAWN BEING PUT BACK BEFORE THE NEXT DRAW IS MADE. WHAT WE RECORD IS THE NUMBERS ON THE BALLS TOGETHER WITH THEIR ORDER. WE ARE LOOKING AT n-TUPLES (a1, a2, a3, …, an) WHERE EACH aj CAN BE ANY OF 1 TO m. THEREFORE THE ANSWER IS m^n.

2. SAMPLING WITHOUT REPLACEMENT AND WITH ORDERING: THE ANSWER IS m·(m-1)·(m-2)···(m-n+1) = (m)_n.

2a. PERMUTATION OF m DISTINGUISHABLE BALLS: THE ANSWER IS m!.

3. SAMPLING WITHOUT REPLACEMENT AND WITHOUT ORDERING: HERE THE ORDER OF THE SEQUENCE IS NOT RECORDED (I.E. 123 IS THE SAME AS 321), AS IF WE DRAW n BALLS IN ONE GRAB. FOR EXAMPLE, IF m = 5 AND n = 3 THEN {3, 5, 2} CAN BE DRAWN IN 3! = 6 ORDERS. THEREFORE THE ANSWER IS (m)_n/n! = m!/(n!·(m-n)!). THIS EXPRESSION IS ALSO KNOWN AS THE BINOMIAL COEFFICIENT.

PERMUTATION OF m BALLS THAT ARE DISTINGUISHABLE BY GROUPS: SUPPOSE THERE ARE m1 BALLS OF ONE COLOUR, m2 BALLS OF A SECOND COLOUR AND SO ON, WITH m1 + m2 + … + mr = m. HOW MANY DISTINGUISHABLE ARRANGEMENTS ARE THERE? THE TOTAL NUMBER OF PERMUTATIONS = m!/(m1!·m2!···mr!). FOR INSTANCE, IF m1 = m2 = 2, m = 4 AND THE TWO COLOURS ARE BLACK AND WHITE, THE ANSWER IS 4!/(2!·2!) = 6.

4. SAMPLING WITH REPLACEMENT AND WITHOUT ORDERING: SUPPOSE WE TOSS TWO COINS, SO m = 2 AND n = 2 (HT = TH). THEN THERE ARE THREE POSSIBILITIES: TT, HH AND TH (= HT). IN GENERAL THE ANSWER IS $\binom{m+n-1}{m-1}$, HERE 3!/(2!·1!) = 3.

IF 6 DICE ARE ROLLED AND THE DICE ARE NOT DISTINGUISHABLE, THEN THE NUMBER OF DISTINGUISHABLE PATTERNS IS 11!/(6!·5!) = 462.

SOME EXAMPLES:

EXAMPLE 1: 6 MOUNTAINEERS DECIDE TO DIVIDE INTO THREE GROUPS FOR THE FINAL ASSAULT ON THE PEAK. THE GROUPS WILL BE OF SIZE 1, 2 AND 3 RESPECTIVELY. IN HOW MANY WAYS IS THIS POSSIBLE? 6!/(3!·2!·1!) = 60. HAVING FORMED THESE GROUPS, THE ORDER IN WHICH THEY GO (WHICH GROUP LEADS, WHICH IS IN THE MIDDLE AND SO ON) CAN BE CHOSEN IN 3! = 6 WAYS. THE FINAL ANSWER IS 60 × 6 = 360.
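The four sampling schemes and the grouping count can be reproduced with Python's standard library. This is a sketch for checking the arithmetic; the variable names are illustrative, and the brute-force enumeration is included only as a cross-check.

```python
import math
from itertools import combinations_with_replacement

m, n = 6, 6  # six faces, six dice

# 1. With replacement, with ordering: m^n
ordered_with_replacement = m ** n                     # 46656

# 2. Without replacement, with ordering: m(m-1)...(m-n+1)
ordered_without_replacement = math.perm(m, n)         # 720

# 3. Without replacement, without ordering: binomial coefficient (m=5, n=3)
unordered_without_replacement = math.comb(5, 3)       # 10

# 4. With replacement, without ordering: C(m+n-1, n)
unordered_with_replacement = math.comb(m + n - 1, n)  # 462

# Cross-check scheme 4 by listing every multiset of six faces.
brute_force = sum(1 for _ in combinations_with_replacement(range(1, m + 1), n))

# Splitting 6 mountaineers into groups of sizes 3, 2 and 1.
groupings = math.factorial(6) // (
    math.factorial(3) * math.factorial(2) * math.factorial(1)
)                                                     # 60

print(ordered_with_replacement, ordered_without_replacement,
      unordered_without_replacement, unordered_with_replacement,
      brute_force, groupings)
```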

PERMUTATION & COMBINATION: IF A DECK OF CARDS IS SHUFFLED THOROUGHLY, WHAT IS THE PROBABILITY THAT THE FOUR ACES APPEAR IN SEQUENCE (CONSECUTIVELY)?

THE TOTAL NUMBER OF PERMUTATIONS IS 52!. THE BLOCK OF FOUR ACES CAN START AT ANY OF 49 POSITIONS. FURTHER, THE FOUR ACES WITHIN THE BLOCK CAN BE ARRANGED IN 4! WAYS, AND THE OTHER 48 CARDS CAN BE ARRANGED IN 48! WAYS.

THEREFORE THE FINAL ANSWER IS 49·4!·48!/52! = 24/(52·51·50) ≈ 0.018%.

ANOTHER EXAMPLE: FIFTEEN NEW STUDENTS ARE TO BE DISTRIBUTED EVENLY AMONG THREE CLASSES. THREE ARE WHIZ KIDS. WHAT IS THE PROBABILITY THAT EACH CLASS GETS ONE?

TOTAL NUMBER OF WAYS = 15!/(5!·5!·5!) = 756756. THERE ARE 3! = 6 WAYS IN WHICH THE THREE WHIZ KIDS CAN BE DISTRIBUTED AMONG THE THREE CLASSES. THE REMAINING 12 CAN BE DISTRIBUTED IN 12!/(4!·4!·4!) = 34650 WAYS. THEREFORE THE FINAL ANSWER IS [6·12!/(4!·4!·4!)]/[15!/(5!·5!·5!)] = 207900/756756 ≈ 27.47%.

PERMUTATION & COMBINATION: A PRODUCTION LOT OF SIZE 100 IS KNOWN TO BE 5% DEFECTIVE. A RANDOM SAMPLE OF 10 ITEMS IS SELECTED WITHOUT REPLACEMENT. WHAT IS THE PROBABILITY THAT THERE IS NO DEFECTIVE ITEM IN THE SAMPLE?

THE NUMBER OF WAYS IN WHICH 10 ITEMS CAN BE DRAWN OUT OF 100 (HERE THE SEQUENCE IS NOT IMPORTANT) = 100!/(90!·10!) ≈ 1.73 × 10^13. THE NUMBER OF WAYS IN WHICH THE SAMPLE CONTAINS NO DEFECTIVE = [5!/(0!·5!)]·[95!/(10!·85!)] ≈ 1.01 × 10^13.

THEREFORE THE PROBABILITY THAT THE SAMPLE CONTAINS NO DEFECTIVE ≈ 0.58375, I.E. ABOUT 58.4%.

TO GENERALISE THIS EXAMPLE, CONSIDER ANY POPULATION OF N ITEMS OF WHICH D BELONG TO A PARTICULAR CLASS. A RANDOM SAMPLE OF n IS SELECTED WITHOUT REPLACEMENT. IF A DENOTES THE EVENT OF OBTAINING EXACTLY r ITEMS FROM THE CLASS OF INTEREST, THEN

$P(A) = \binom{D}{r}\binom{N-D}{n-r} \Big/ \binom{N}{n}$
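The general formula can be evaluated directly for the production-lot example. A minimal sketch, in which the function name `hypergeometric_pmf` is an assumption and the parameter names follow the symbols N, D, n and r used in the text:

```python
import math


def hypergeometric_pmf(N, D, n, r):
    """P(exactly r items of the class of interest in a sample of n).

    N = population size, D = items in the class of interest,
    n = sample size drawn without replacement.
    """
    return math.comb(D, r) * math.comb(N - D, n - r) / math.comb(N, n)


# Lot of 100 with 5 defectives, sample of 10, no defectives in the sample.
p_no_defective = hypergeometric_pmf(N=100, D=5, n=10, r=0)
print(round(p_no_defective, 5))

# The probabilities over all feasible r add up to one.
total = sum(hypergeometric_pmf(100, 5, 10, r) for r in range(0, 6))
print(round(total, 12))
```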

COMBINATIONS:

$\binom{n}{r} = \binom{n}{n-r}$

$\binom{n}{r} = \binom{n-1}{r-1} + \binom{n-1}{r}$

CONTINUOUS RANDOM VARIABLES:

$P(a \le X \le b) = \int_a^b f_X(x)\,dx$

$f_X(x)$ IS THE PROBABILITY DENSITY FUNCTION. IT SATISFIES THE FOLLOWING CONDITIONS:

$f_X(x) \ge 0$ FOR ALL x OVER THE RANGE R, AND $\int_R f_X(x)\,dx = 1$

FOR A CONTINUOUS RANDOM VARIABLE, THE PROBABILITY AT ANY SINGLE POINT, SAY x0, IS ZERO, BECAUSE THE INTEGRAL OF THE DENSITY FROM x0 TO x0 IS ZERO.

THE TIME TO FAILURE OF A CATHODE TUBE IS DESCRIBED BY THE FOLLOWING DENSITY: $f(t) = \lambda e^{-\lambda t}$ FOR t ≥ 0 AND 0 OTHERWISE, WHERE λ > 0 IS KNOWN AS THE CONSTANT FAILURE RATE. WHAT IS P(T ≥ 100 HRS)?

$P(T \ge 100) = \int_{100}^{\infty} \lambda e^{-\lambda t}\,dt = e^{-100\lambda}$

$P(T \ge 100 \,/\, T > 99) = \frac{\int_{100}^{\infty} \lambda e^{-\lambda t}\,dt}{\int_{99}^{\infty} \lambda e^{-\lambda t}\,dt} = \frac{e^{-100\lambda}}{e^{-99\lambda}} = e^{-\lambda}$
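The conditional result depends only on the extra hour survived, not on the 99 hours already elapsed. The sketch below checks this numerically; the failure rate `lam = 0.01` per hour is a hypothetical value, since the slides leave λ unspecified.

```python
import math


def survival(t, lam):
    """P(T >= t) for an exponential time to failure with rate lam."""
    return math.exp(-lam * t)


lam = 0.01  # hypothetical failure rate per hour (not given in the slides)

p_100 = survival(100, lam)                        # e^{-100 lam}
p_100_given_99 = p_100 / survival(99, lam)        # conditional probability
p_one_more_hour = survival(1, lam)                # e^{-lam}

print(p_100, p_100_given_99, p_one_more_hour)
```

The last two printed values agree, which is the memoryless behaviour implied by the ratio above.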

ANOTHER EXAMPLE: f(x) = x FOR 0 ≤ x < 1, f(x) = 2 - x FOR 1 ≤ x < 2, AND f(x) = 0 OTHERWISE. LET US LOOK AT THE GRAPHICAL REPRESENTATION.

[FIGURE: DENSITY f(x) PLOTTED AGAINST x, WITH x-AXIS MARKS AT 0, 1 AND 2]

THE PROBABILITY THAT -1 < X < 1/2 IS

$\int_{-1}^{0} 0\,dx + \int_{0}^{1/2} x\,dx = 0 + \frac{(1/2)^2 - 0^2}{2} = \frac{1}{8}$

SIMILARLY WE CAN FIND THAT P(X ≤ 3/2) = 7/8, P(X ≤ 3) = 1 AND P(X ≥ 2.5) = 0.

IN DESCRIBING PROBABILITY FUNCTIONS A MATHEMATICAL MODEL IS USUALLY EMPLOYED. THE AREA UNDER THE DENSITY FUNCTION CORRESPONDS TO PROBABILITY AND THE TOTAL AREA IS 1.

MEAN OF THE RANDOM VARIABLE:

$\mu = \sum_i x_i\,p(x_i)$ FOR DISCRETE X

$\mu = \int_{-\infty}^{\infty} x f(x)\,dx$ FOR CONTINUOUS X

CONSIDER A COIN-TOSSING EXPERIMENT (THREE TOSSES) WHERE THE RANDOM VARIABLE X REPRESENTS THE NUMBER OF HEADS.

P(X=0) = 1/8 (CONTRIBUTES 0·1/8); P(X=1) = 3/8 (CONTRIBUTES 1·3/8); P(X=2) = 3/8 (CONTRIBUTES 2·3/8); P(X=3) = 1/8 (CONTRIBUTES 3·1/8).

µ = 0 + 3/8 + 6/8 + 3/8 = 12/8 = 3/2

SIMILARLY, FOR THE DENSITY FUNCTION f(x) = x ON [0, 1) AND 2 - x ON [1, 2), THE MEAN IS

$\mu = \int_0^1 x\cdot x\,dx + \int_1^2 x(2-x)\,dx = \frac13 + \left[(4-1) - \left(\frac83 - \frac13\right)\right] = \frac13 + \left(3 - \frac73\right) = \frac13 + \frac23 = 1$

VARIANCE: SPREAD OR DISPERSION AROUND THE MEAN

$\sigma^2 = \sum_i (x_i - \mu)^2\,p(x_i)$ FOR DISCRETE X

$\sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$ FOR CONTINUOUS X

FOR THE COIN-TOSSING EXPERIMENT WITH TWO TOSSES (µ = 1): $\sigma^2 = (0-1)^2(1/4) + (1-1)^2(1/2) + (2-1)^2(1/4) = 1/2$

THE UNITS OF THE RANDOM VARIABLE AND THE MEAN ARE THE SAME, WHEREAS THE UNITS OF THE VARIANCE ARE SQUARED.

ANOTHER MEASURE OF DISPERSION IS THE STANDARD DEVIATION σ, WHICH IS THE SQUARE ROOT OF THE VARIANCE.

ANOTHER FORMULA FOR THE VARIANCE:

$\sigma^2 = \sum_i x_i^2\,p(x_i) - \mu^2 = E(X^2) - [E(X)]^2$
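Both variance formulas can be checked on the coin-tossing examples. The sketch below builds the distribution of the number of heads for a fair coin; the helper names `heads_pmf`, `mean`, `variance` and `variance_shortcut` are illustrative assumptions.

```python
from fractions import Fraction
from math import comb


def heads_pmf(tosses):
    """Distribution of the number of heads in the given number of fair tosses."""
    return {k: Fraction(comb(tosses, k), 2 ** tosses) for k in range(tosses + 1)}


def mean(pmf):
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    """Definition: probability-weighted squared deviation from the mean."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())


def variance_shortcut(pmf):
    """Alternative formula: E(X^2) - [E(X)]^2."""
    return sum(x * x * p for x, p in pmf.items()) - mean(pmf) ** 2


print(mean(heads_pmf(3)))               # three tosses
print(variance(heads_pmf(2)))           # two tosses, by definition
print(variance_shortcut(heads_pmf(2)))  # two tosses, shortcut formula
```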

PROPERTIES OF VARIANCE:
VAR(c) = 0 FOR ALL CONSTANTS c.
VAR(mX + b) = m^2·VAR(X)
VAR(X + Y) = VAR(X) + VAR(Y) IF X AND Y ARE INDEPENDENT
VAR(X + Y) = VAR(X) + 2·COV(X, Y) + VAR(Y) IN GENERAL

SKEWNESS: ONE POSSIBLE MEASURE IS $E[(X-\mu_X)^3]$. IF THIS NUMBER IS > 0 THEN THE DISTRIBUTION IS SKEWED TO THE RIGHT, AND IF IT IS < 0 THEN THE DISTRIBUTION IS SKEWED TO THE LEFT.

TO DESCRIBE THE DEGREE OF SKEWNESS, THE STATISTIC THAT IS OFTEN USED IS

$\gamma_X = E\left[\left(\frac{X-\mu_X}{\sigma_X}\right)^3\right]$

IN A POSITIVELY SKEWED DISTRIBUTION, THE MODE IS AT THE HIGHEST POINT OF THE DISTRIBUTION, THE MEDIAN IS TO THE RIGHT OF THAT, AND THE MEAN IS TO THE RIGHT OF BOTH THE MODE AND THE MEDIAN.

KURTOSIS: THE DIFFERENCE IN PEAKEDNESS BETWEEN TWO DISTRIBUTIONS:

$E\left[\left(\frac{X-\mu_X}{\sigma_X}\right)^4\right]$

KURTOSIS MEASURES WHETHER A DISTRIBUTION IS MORE OR LESS PEAKED THAN A NORMAL DISTRIBUTION.

LEPTOKURTIC: MORE PEAKED THAN A NORMAL.
PLATYKURTIC: LESS PEAKED THAN A NORMAL.

A LEPTOKURTIC DISTRIBUTION IS MORE PEAKED AND HAS FATTER TAILS. THIS IMPLIES MORE RETURNS CLUSTERED AROUND THE MEAN AND MORE RETURNS WITH LARGE DEVIATIONS FROM THE MEAN.

FOR ALL NORMAL DISTRIBUTIONS KURTOSIS IS EQUAL TO 3. EXCESS KURTOSIS IS MEASURED KURTOSIS LESS THREE. A LEPTOKURTIC DISTRIBUTION HAS EXCESS KURTOSIS GREATER THAN ZERO AND A PLATYKURTIC DISTRIBUTION HAS EXCESS KURTOSIS LESS THAN ZERO.

MOST EQUITY RETURN SERIES HAVE BEEN FOUND TO BE LEPTOKURTIC. THIS IMPLIES THAT IF WE USE STATISTICAL MODELS THAT DO NOT ALLOW FOR FATTER TAILS, WE WILL UNDERESTIMATE THE LIKELIHOOD OF VERY BAD OR VERY GOOD OUTCOMES.

FOR EXAMPLE, THE RETURN OF THE S&P 500 ON 19TH OCTOBER 1987 WAS 20 STANDARD DEVIATIONS AWAY FROM THE MEAN DAILY RETURN. IF DAILY RETURNS ARE NORMALLY DISTRIBUTED, THEN RETURNS FOUR STANDARD DEVIATIONS AWAY FROM THE MEAN SHOULD OCCUR ONCE EVERY 50 YEARS.

THE MONTHLY RETURN SERIES OF THE S&P 500 HAS VERY LARGE KURTOSIS (9.5) AND BY CONTRAST THE ANNUAL S&P RETURN SERIES HAS SLIGHTLY NEGATIVE EXCESS KURTOSIS (-0.2).

[FIGURE: A POSITIVELY SKEWED AND A NEGATIVELY SKEWED DISTRIBUTION]

INDEPENDENCE OF RANDOM VARIABLES: X1 AND X2 ARE INDEPENDENT IF AND ONLY IF $p(x_{1i}, x_{2j}) = p(x_{1i})\,p(x_{2j})$ FOR ALL i AND j.

IF THE VARIABLES ARE CONTINUOUS, WE SAY THAT THEY ARE INDEPENDENT IF AND ONLY IF $f(x_1, x_2) = f_1(x_1)\,f_2(x_2)$.

IF X1 AND X2 ARE INDEPENDENT, THEN WE CAN SAY THAT

COVARIANCE & CORRELATION: COVARIANCE = $E[(X-\mu_X)(Y-\mu_Y)]$

IT CAN BE SHOWN THAT COV(X, Y) = E(XY) - E(X)·E(Y). THEREFORE IF X AND Y ARE INDEPENDENT THEN COV(X, Y) = 0.

NOTICE THAT FOR ANY GIVEN PAIR OF VALUES OF X AND Y, THE CONTRIBUTION TO THE COVARIANCE IS POSITIVE IF BOTH X AND Y LIE ON THE SAME SIDE OF THEIR RESPECTIVE MEANS, WHEREAS IT IS NEGATIVE IF ONE IS ABOVE ITS MEAN AND THE OTHER IS BELOW.

COVARIANCE & CORRELATION: CORRELATION IS A STATISTIC THAT MEASURES BOTH THE SIGN AND THE DEGREE OF ASSOCIATION BETWEEN TWO RANDOM VARIABLES.

$\rho_{XY} = \frac{\mathrm{Cov}(X,Y)}{\sigma_X \sigma_Y}$, WITH $-1 \le \rho_{XY} \le 1$

THIS PROPERTY FOLLOWS FROM THE CAUCHY-SCHWARZ INEQUALITY $[E(XY)]^2 \le E[X^2]\,E[Y^2]$. APPLYING IT TO THE CENTRED VARIABLES GIVES $\{E[(X-\mu_X)(Y-\mu_Y)]\}^2 \le E[(X-\mu_X)^2]\,E[(Y-\mu_Y)^2]$.

THEREFORE IT FOLLOWS THAT $\rho_{XY}^2 \le 1$, I.E. $-1 \le \rho_{XY} \le 1$.

ANOTHER IMPORTANT EQUALITY IS $\mathrm{Var}(aX + bY) = a^2\,\mathrm{Var}(X) + 2ab\,\mathrm{Cov}(X,Y) + b^2\,\mathrm{Var}(Y)$.

SOME IMPORTANT DISCRETE DISTRIBUTIONS:

BERNOULLI TRIALS: A TRIAL WITH TWO POSSIBLE OUTCOMES, SUCCESS (S) OR FAILURE (F), IS KNOWN AS A BERNOULLI TRIAL.

SUPPOSE WE CONDUCT n BERNOULLI TRIALS. THIS IS CALLED A BERNOULLI PROCESS IF THE TRIALS ARE INDEPENDENT, EACH TRIAL HAS TWO POSSIBLE OUTCOMES {S} OR {F}, AND THE PROBABILITY OF SUCCESS REMAINS CONSTANT FROM TRIAL TO TRIAL.

FURTHER, FOR EACH SUCCESS YOU GET 1 AND FOR EACH FAILURE YOU GET 0. LET p BE THE PROBABILITY OF SUCCESS AND q = 1 - p BE THE PROBABILITY OF FAILURE.

E(Xi) = 0·q + 1·p = p
V(Xi) = [0^2·q + 1^2·p] - p^2 = p(1-p)

SUPPOSE AN EXPERIMENT CONSISTS OF THREE BERNOULLI TRIALS AND THE PROBABILITY OF SUCCESS IS p ON EACH TRIAL. THE RANDOM VARIABLE X DENOTES THE NUMBER OF SUCCESSES. IT CAN TAKE ANY VALUE FROM 0 TO 3, AS SHOWN IN THE NEXT SLIDE.

BERNOULLI TRIALS:

OUTCOMES         x    p(x)
FFF              0    q·q·q = q^3
FFS, FSF, SFF    1    3·p·q^2
FSS, SFS, SSF    2    3·p^2·q
SSS              3    p^3

THE BINOMIAL DISTRIBUTION: THE RANDOM VARIABLE X THAT DENOTES THE NUMBER OF SUCCESSES IN n BERNOULLI TRIALS HAS A BINOMIAL DISTRIBUTION GIVEN BY

$p(x) = \binom{n}{x} p^x (1-p)^{n-x}$ FOR x = 0, 1, 2, …, n.

THE MEAN CAN BE DETERMINED BY

$E(X) = \sum_{x=0}^{n} x\,\frac{n!}{x!\,(n-x)!}\,p^x q^{n-x}$

$E(X) = np \sum_{x=1}^{n} \frac{(n-1)!}{(x-1)!\,(n-x)!}\,p^{x-1} q^{n-x}$

LETTING y = x - 1:

$E(X) = np \sum_{y=0}^{n-1} \frac{(n-1)!}{y!\,(n-1-y)!}\,p^{y} q^{n-1-y}$

THE SUMMATION EQUALS 1 (IT IS THE SUM OF A BINOMIAL DISTRIBUTION WITH n - 1 TRIALS), AND THEREFORE E(X) = np. SIMILARLY IT CAN BE SHOWN THAT THE VARIANCE = npq.

A MUCH EASIER APPROACH IS TO CONSIDER X AS A SUM OF n INDEPENDENT BERNOULLI RANDOM VARIABLES, EACH WITH MEAN p AND VARIANCE pq, SO THAT X = X1 + X2 + … + Xn. THEN E(X) = p + p + … + p (n TIMES) = np, AND SIMILARLY THE VARIANCE = npq.

THE CUMULATIVE BINOMIAL DISTRIBUTION FUNCTION F HAS BEEN EXTENSIVELY TABULATED.
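The results E(X) = np and variance = npq can be confirmed numerically by summing over the binomial probabilities. This is only a sketch; the values `n = 10` and `p = 0.3` are hypothetical choices, and `binomial_pmf` is an illustrative name.

```python
from math import comb, isclose


def binomial_pmf(n, p):
    """List of p(x) for x = 0..n in n Bernoulli trials with success probability p."""
    q = 1 - p
    return [comb(n, x) * p ** x * q ** (n - x) for x in range(n + 1)]


n, p = 10, 0.3  # hypothetical example values
pmf = binomial_pmf(n, p)

total = sum(pmf)
mean = sum(x * px for x, px in enumerate(pmf))
variance = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))

print(total, mean, variance)  # compare with 1, n*p and n*p*(1-p)
```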

THE BINOMIAL DISTRIBUTION IS SYMMETRIC IF p = 0.5. THIS FOLLOWS FROM THE FACT THAT THE NUMBER OF WAYS IN WHICH 3 SUCCESSES CAN BE ACHIEVED OUT OF 5 IS THE SAME AS FOR 2 OUT OF 5. SIMILARLY, 1 OUT OF 5 IS THE SAME AS 4 OUT OF 5.

THE GEOMETRIC DISTRIBUTION: THIS IS ALSO RELATED TO A SEQUENCE OF BERNOULLI TRIALS. THE RANDOM VARIABLE HERE IS DEFINED AS THE NUMBER OF TRIALS REQUIRED TO ACHIEVE THE FIRST SUCCESS.

$p(x) = q^{x-1} p$ FOR x = 1, 2, …

IT CAN BE SEEN EASILY THAT THIS IS A PROBABILITY DISTRIBUTION, SINCE $\sum_{x=1}^{\infty} p\,q^{x-1} = p \sum_{k=0}^{\infty} q^k = p\cdot\frac{1}{1-q} = 1$.

$\mu = E(X) = \sum_{x=1}^{\infty} x\,p\,q^{x-1} = p\,\frac{d}{dq}\left[\sum_{x=1}^{\infty} q^x\right] = p\,\frac{d}{dq}\left[\frac{q}{1-q}\right] = \frac{p}{(1-q)^2} = \frac{1}{p}$

SIMILARLY, THE VARIANCE = q/p^2.
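The geometric mean and variance can be checked by summing the series numerically. The sketch below truncates the infinite sums at a large number of terms, which is an approximation; `p = 0.2` and the cut-off of 2000 terms are hypothetical choices.

```python
def geometric_pmf(x, p):
    """P(first success occurs on trial x), x = 1, 2, ..."""
    return (1 - p) ** (x - 1) * p


p = 0.2                 # hypothetical success probability
q = 1 - p
trials = range(1, 2001)  # truncation of the infinite sum (tail is negligible here)

total = sum(geometric_pmf(x, p) for x in trials)
mean = sum(x * geometric_pmf(x, p) for x in trials)
variance = sum((x - mean) ** 2 * geometric_pmf(x, p) for x in trials)

print(total, mean, variance)  # compare with 1, 1/p and q/p^2
```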

THE PASCAL DISTRIBUTION: THE BASIS IS AGAIN BERNOULLI TRIALS; IT IS A LOGICAL EXTENSION OF THE GEOMETRIC DISTRIBUTION. IN THIS CASE THE RANDOM VARIABLE X DENOTES THE TRIAL ON WHICH THE rth SUCCESS OCCURS, WHERE r IS AN INTEGER.

$p(x) = \binom{x-1}{r-1} p^r (1-p)^{x-r}$ FOR x = r, r+1, r+2, …

HERE THERE MUST BE r - 1 SUCCESSES IN THE FIRST x - 1 TRIALS, AND THE FIRST TERM IN THE ABOVE EXPRESSION REFLECTS JUST THAT.

HERE µ = r/p AND σ^2 = rq/p^2.

THE MULTINOMIAL DISTRIBUTION: ASSUME AN EXPERIMENT WHERE THE SAMPLE SPACE IS PARTITIONED INTO k MUTUALLY EXCLUSIVE EVENTS, SAY B1, B2, …, Bk.

WE CONSIDER n INDEPENDENT REPETITIONS OF THE EXPERIMENT AND LET pi = P(Bi) BE CONSTANT FROM TRIAL TO TRIAL.

THE RANDOM VECTOR (X1, X2, …, Xk), WHERE Xi IS THE NUMBER OF TIMES Bi OCCURS IN THE n REPETITIONS, HAS THE FOLLOWING DISTRIBUTION:

$P(x_1, x_2, \ldots, x_k) = \frac{n!}{x_1!\,x_2!\cdots x_k!}\,p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k}$

HERE ∑Xi = n FOR ANY n REPETITIONS. E(Xi) = n·pi AND V(Xi) = n·pi·(1 - pi).

THE EXPONENTIAL DISTRIBUTION: $f(x) = \lambda e^{-\lambda x}$ FOR x ≥ 0, AND 0 OTHERWISE. λ IS A POSITIVE REAL CONSTANT.

POISSON DISTRIBUTION: DESCRIBES THE NUMBER OF OCCURRENCES x IN TIME t:

$p(x) = \frac{e^{-\lambda t}(\lambda t)^x}{x!}$

WHAT IS THE PROBABILITY THAT NO OCCURRENCE TAKES PLACE IN TIME t? $p(0) = e^{-\lambda t}$. ANOTHER WAY TO LOOK AT THIS IS AS THE PROBABILITY THAT THE FIRST OCCURRENCE TAKES PLACE AT A TIME T > t, I.E. $p(0) = P(T > t) = e^{-\lambda t}$.

IF T IS CONSIDERED AS A RANDOM VARIABLE WHICH DENOTES THE TIME TO THE FIRST OCCURRENCE, THEN THE CUMULATIVE PROBABILITY FUNCTION IS $F(t) = P(T \le t) = 1 - e^{-\lambda t}$, AND THE PROBABILITY DENSITY FUNCTION f(t), WHICH IS THE DERIVATIVE OF F(t), IS $\lambda e^{-\lambda t}$.
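The link between the two distributions is that "no occurrence up to time t" and "first occurrence after time t" are the same event. A small numerical sketch follows; the rate `lam = 2.0` and the time `t = 1.5` are hypothetical values.

```python
import math


def poisson_pmf(x, lam, t):
    """P(exactly x occurrences in time t) for a Poisson process with rate lam."""
    return math.exp(-lam * t) * (lam * t) ** x / math.factorial(x)


def exponential_cdf(t, lam):
    """F(t) = P(T <= t) for the waiting time T to the first occurrence."""
    return 1 - math.exp(-lam * t)


lam, t = 2.0, 1.5  # hypothetical rate and time horizon

p_zero_occurrences = poisson_pmf(0, lam, t)
p_first_after_t = 1 - exponential_cdf(t, lam)

print(p_zero_occurrences, p_first_after_t)  # the two values coincide
```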

RELATIONSHIP BETWEEN EXPONENTIAL AND POISSON DISTRIBUTION:

IF THE NUMBER OF OCCURRENCES HAS A POISSON DISTRIBUTION, THEN THE TIME BETWEEN OCCURRENCES HAS AN EXPONENTIAL DISTRIBUTION. FOR EXAMPLE, IF THE NUMBER OF ORDERS RECEIVED PER WEEK HAS A POISSON DISTRIBUTION THEN THE TIME BETWEEN ORDERS WOULD HAVE AN EXPONENTIAL DISTRIBUTION.

$\int_0^\infty \lambda e^{-\lambda x}\,dx = \left[-e^{-\lambda x}\right]_0^\infty = 1$

$E(X) = \int_0^\infty x\,\lambda e^{-\lambda x}\,dx = \left[-x e^{-\lambda x}\right]_0^\infty + \int_0^\infty e^{-\lambda x}\,dx = \frac{1}{\lambda}$

$V(X) = \frac{1}{\lambda^2}$

[FIGURE: EXPONENTIAL DENSITY FUNCTION f(x) PLOTTED AGAINST x, WITH λ MARKED ON THE VERTICAL AXIS]

[FIGURE: DISTRIBUTION FUNCTION F(x) = 1 - e^{-λx} PLOTTED AGAINST x, WITH 1 MARKED ON THE VERTICAL AXIS]

NORMAL DISTRIBUTION

A SYMMETRICAL, BELL-SHAPED DISTRIBUTION. IT WAS FIRST PRESENTED IN MATHEMATICAL FORM IN 1733 BY DE MOIVRE, WHO DERIVED IT AS A LIMITING FORM OF THE BINOMIAL DISTRIBUTION.

IT IS ALSO KNOWN AS THE GAUSSIAN DISTRIBUTION (THROUGH HISTORICAL ERROR).

A RANDOM VARIABLE X IS SAID TO HAVE A NORMAL DISTRIBUTION WITH MEAN µ (-∞ < µ < ∞) AND VARIANCE σ^2 > 0 IF IT HAS THE DENSITY FUNCTION

$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac12\left(\frac{x-\mu}{\sigma}\right)^2}$

IT IS USED SO EXTENSIVELY THAT THE SHORTHAND NOTATION X ~ N(µ, σ^2) IS USED TO INDICATE THAT X IS NORMALLY DISTRIBUTED WITH MEAN µ AND VARIANCE σ^2.

IT HAS THE FOLLOWING IMPORTANT PROPERTIES:

$\int_{-\infty}^{\infty} f(x)\,dx = 1$
f(x) ≥ 0 FOR ALL x
f(x) → 0 AS x APPROACHES ∞ OR -∞
f(µ + x) = f(µ - x): THE DENSITY IS SYMMETRIC ABOUT µ
THE MAXIMUM VALUE OF f OCCURS AT x = µ
THE POINTS OF INFLECTION OF f ARE AT x = µ ± σ

[FIGURE: NORMAL DISTRIBUTION DENSITY f(x)]

$F(x) = P(X \le x) = P\left(Z \le \frac{x-\mu}{\sigma}\right) = \int_{-\infty}^{(x-\mu)/\sigma} \frac{1}{\sqrt{2\pi}}\,e^{-z^2/2}\,dz$

Z HERE IS THE STANDARD NORMAL VARIABLE, WHICH HAS MEAN 0 AND VARIANCE 1.

$\varphi(z) = \frac{1}{\sqrt{2\pi}}\,e^{-z^2/2}$, -∞ < z < ∞ (DENSITY FUNCTION)

$\Phi(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\,e^{-u^2/2}\,du$ (CUMULATIVE FUNCTION)

THE STANDARD NORMAL DISTRIBUTION IS WELL TABULATED.

EXAMPLE: SUPPOSE X HAS A NORMAL DISTRIBUTION N(100, 4). WE WISH TO EVALUATE F(104), I.E. P(X ≤ 104) = Φ((104-100)/2) = Φ(2) = 0.9772.

Z = (X - µ)/σ MEASURES THE DEPARTURE OF X FROM THE MEAN IN STANDARD DEVIATION UNITS. IN OUR EXAMPLE 104 IS TWO STANDARD DEVIATIONS AWAY FROM 100.
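Instead of a table, the standard normal cumulative function can be evaluated through the error function in Python's `math` module. The function names `standard_normal_cdf` and `normal_cdf` are illustrative; the identity Φ(z) = ½[1 + erf(z/√2)] is standard.

```python
import math


def standard_normal_cdf(z):
    """Phi(z) expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma^2), by standardising to Z = (x - mu)/sigma."""
    return standard_normal_cdf((x - mu) / sigma)


# N(100, 4) has variance 4, so the standard deviation is 2.
p = normal_cdf(104, mu=100, sigma=2)
print(round(p, 4))
```

Note that the second parameter of N(100, 4) is the variance, so the code passes its square root as `sigma`.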

[FIGURE: STANDARD NORMAL DENSITY φ(z) PLOTTED AGAINST z, CENTRED AT 0]

SYMMETRIC INTERVALS:
P{µ - 1σ ≤ X ≤ µ + 1σ} = .6826
P{µ - 1.645σ ≤ X ≤ µ + 1.645σ} = .90
P{µ - 1.96σ ≤ X ≤ µ + 1.96σ} = .95
P{µ - 2.57σ ≤ X ≤ µ + 2.57σ} = .99
P{µ - 3σ ≤ X ≤ µ + 3σ} = .9973

REPRODUCTIVE PROPERTY OF THE NORMAL DISTRIBUTION: SUPPOSE WE HAVE n INDEPENDENT NORMAL VARIABLES X1, X2, …, Xn, WHERE Xi ~ N(µi, σi^2), AND Y = X1 + X2 + … + Xn. THEN Y IS ALSO NORMALLY DISTRIBUTED, WITH

$E(Y) = \sum_{i=1}^{n} \mu_i$ AND $V(Y) = \sum_{i=1}^{n} \sigma_i^2$
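The symmetric-interval probabilities listed above can be recomputed rather than read from a table. In this sketch `central_probability` is an illustrative name; it relies on the standard identity P(|Z| ≤ k) = erf(k/√2) for a standard normal Z.

```python
import math


def central_probability(k):
    """P(mu - k*sigma <= X <= mu + k*sigma) for any normal X."""
    return math.erf(k / math.sqrt(2.0))


for k in (1, 1.645, 1.96, 2.57, 3):
    print(k, round(central_probability(k), 4))
```

The result does not depend on µ or σ, because standardising X removes both.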

NORMAL DISTRIBUTION: THE NORMAL DISTRIBUTION CAN BE INTERPRETED IN DIFFERENT WAYS. ONE INTERPRETATION IS THAT IT IS THE CONTINUOUS ANALOGUE OF THE BINOMIAL DISTRIBUTION WITH p = 1/2.

BINOMIAL & B&S CONVERGENCE

CENTRAL LIMIT THEOREM: IF A RANDOM VARIABLE Y IS THE SUM OF n INDEPENDENT RANDOM VARIABLES WHICH SATISFY CERTAIN GENERAL CONDITIONS, THEN FOR SUFFICIENTLY LARGE n, Y IS APPROXIMATELY NORMALLY DISTRIBUTED.

IF X1, X2, …, Xn IS A SEQUENCE OF n INDEPENDENT VARIABLES WITH E(Xi) = µi AND V(Xi) = σi^2, AND Y = X1 + X2 + … + Xn, THEN UNDER SOME GENERAL CONDITIONS

$Z_n = \frac{Y - \sum_{i=1}^{n}\mu_i}{\sqrt{\sum_{i=1}^{n}\sigma_i^2}}$

HAS APPROXIMATELY AN N(0, 1) DISTRIBUTION AS n APPROACHES INFINITY.

IF ALL THE µ's ARE THE SAME AND ALL THE σ's ARE THE SAME, THEN $Z_n = \frac{Y - n\mu}{\sigma\sqrt{n}}$ HAS APPROXIMATELY AN N(0, 1) DISTRIBUTION.

Page 62: Probability Fundas

PROF. NAVEEN BHATIA 62

BINOMIAL & B&S CONVERENCE

CENTRAL LIMIT THEOREM: HOW LARGE MUST n BE TO GET REASONABLE RESULTS USING THE NORMAL DISTRIBUTION TO APPROXIMATE Y? THE ANSWER DEPENDS UPON THE DISTRIBUTION OF THE Xi's.

THUMB RULES:
IF THE DISTRIBUTION OF THE Xi's DOESN'T RADICALLY DEPART FROM THE NORMAL DISTRIBUTION, THEN n >= 4.
IF THE Xi's HAVE A UNIFORM DENSITY, THEN n >= 12.
IF THE DISTRIBUTION IS ILL BEHAVED, WITH MOST OF ITS MEASURE IN THE TAILS, THEN n >= 100.
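THE UNIFORM THUMB RULE CAN BE CHECKED BY SIMULATION. THE SKETCH BELOW STANDARDISES THE SUM OF n = 12 UNIFORM(0,1) DRAWS AND COMPARES ITS SAMPLE MEAN, VARIANCE AND CENTRAL 95% COVERAGE WITH THE N(0,1) VALUES. THE SEED, THE NUMBER OF REPETITIONS AND THE FUNCTION NAME ARE ARBITRARY ASSUMPTIONS MADE FOR ILLUSTRATION.

```python
import random
from statistics import mean, pvariance

def standardised_uniform_sum(n, rng):
    """Z_n = (Y - n*mu) / (sigma*sqrt(n)) for Y = sum of n Uniform(0,1) draws."""
    mu, var = 0.5, 1.0 / 12.0            # mean and variance of Uniform(0,1)
    y = sum(rng.random() for _ in range(n))
    return (y - n * mu) / (var * n) ** 0.5

rng = random.Random(0)                   # fixed seed, chosen arbitrarily
z_values = [standardised_uniform_sum(12, rng) for _ in range(20000)]

# Share of simulated Z_n inside +/-1.96; for N(0,1) this would be about 0.95.
inside = sum(abs(z) <= 1.96 for z in z_values) / len(z_values)
print(round(mean(z_values), 2), round(pvariance(z_values), 2), round(inside, 3))
```

IF THE APPROXIMATION IS GOOD, THE THREE PRINTED NUMBERS SHOULD BE CLOSE TO 0, 1 AND 0.95 RESPECTIVELY.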


NORMAL APPROXIMATION TO THE BINOMIAL DISTRIBUTION: FOR A BINOMIAL VARIABLE X WITH

$p(x) = \frac{n!}{x!\,(n-x)!}\, p^x q^{n-x}$, WHERE $q = 1 - p$,

$\frac{X - np}{\sqrt{npq}}$ HAS APPROXIMATELY AN N(0,1) DISTRIBUTION IF p IS CLOSE TO 1/2 AND n > 10.

HOWEVER, FOR OTHER VALUES OF p, n MUST BE FAIRLY LARGE, SO THAT np > 5 IF p <= 1/2 OR nq > 5 WHEN p > 1/2.
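AS AN ILLUSTRATION, THE SKETCH BELOW COMPARES AN EXACT BINOMIAL PROBABILITY WITH ITS NORMAL APPROXIMATION. THE VALUES n = 20, p = 0.5 AND k = 12 ARE HYPOTHETICAL, AND THE 0.5 CONTINUITY CORRECTION IS AN ADDED ASSUMPTION THAT THE SLIDES DO NOT MENTION.

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    q = 1.0 - p
    return sum(comb(n, x) * p**x * q**(n - x) for x in range(k + 1))

def normal_approx_cdf(k, n, p):
    """Normal approximation to P(X <= k), with a 0.5 continuity correction (assumption)."""
    q = 1.0 - p
    z = (k + 0.5 - n * p) / sqrt(n * p * q)
    return NormalDist().cdf(z)

n, p, k = 20, 0.5, 12        # illustrative values: p = 1/2 and n > 10
exact = binomial_cdf(k, n, p)
approx = normal_approx_cdf(k, n, p)
print(round(exact, 4), round(approx, 4))
```

WITH p = 1/2 THE TWO PRINTED PROBABILITIES SHOULD BE VERY CLOSE; TRYING A SMALL p WITH THE SAME n SHOWS WHY THE np > 5 CONDITION IS NEEDED.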

NOW LET'S LOOK AT A BINOMIAL TREE IN THE NEXT SLIDE, WHICH CLEARLY SHOWS THAT THE MEAN AND VARIANCE INCREASE PROPORTIONATELY WITH TIME.


NORMAL APPROXIMATION TO THE BINOMIAL DISTRIBUTION: THE FACT THAT THE MEAN AND VARIANCE ARE PROPORTIONAL TO TIME, AND THE STANDARD DEVIATION IS PROPORTIONAL TO THE SQUARE ROOT OF TIME, IS NOT UNIQUE TO BINOMIAL OUTCOMES.

IN FACT THIS PROPERTY HOLDS FOR ANY PROBABILITY DISTRIBUTION OF RETURNS, PROVIDED:

FIRST, THE RETURNS FROM ONE PERIOD TO THE NEXT ARE INDEPENDENT.

SECOND, THE MEAN AND VARIANCE OF ONE-PERIOD RETURNS ARE STATIONARY.

AS SUCH, RETURNS MUST BE DRAWN FROM THE SAME PROBABILITY DISTRIBUTION IN EACH PERIOD.


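NORMAL APPROXIMATION TO THE BINOMIAL DISTRIBUTION:

IF WE USE

$u = \exp\left(\mu\tau/n + \sigma\sqrt{\tau/n}\,\sqrt{(1-\theta)/\theta}\right)$

$d = \exp\left(\mu\tau/n - \sigma\sqrt{\tau/n}\,\sqrt{\theta/(1-\theta)}\right)$

WHERE θ IS THE TRUE PROBABILITY ASSOCIATED WITH AN INCREASE IN THE STOCK PRICE, THEN THE ONE-STEP LOG RETURN HAS MEAN μτ/n AND VARIANCE σ²τ/n.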


IN THE EQUATIONS SHOWN IN THE PREVIOUS SLIDE IT CAN BE SEEN THAT CONVERGENCE TO THE B&S VALUE IS FASTEST WHEN

THE TRUE PROBABILITY θ = 0.5 AND $\mu_a$ (THE CONTINUOUSLY COMPOUNDED RETURN) $= r_a - 0.5\sigma^2$.

USING THIS VALUE OF $\mu_a$ CAUSES THE RISK NEUTRAL PROBABILITY TO CONVERGE TO 0.5 AT A FASTER RATE.

IN PRACTICE IT IS COMMON TO SET $\mu_a$ TO 0 AND θ = 0.5, AND UNDER THESE CONDITIONS THE VALUES OF u AND d ARE

$u = e^{\sigma\sqrt{T/n}}$ AND $d = e^{-\sigma\sqrt{T/n}}$

IF $u < e^{r_a T/n}$, THEN IT IS ADVISABLE TO USE $\mu_a = r_a - 0.5\sigma^2$ WHILE CALCULATING THE VALUES OF u AND d.
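THE EFFECT OF THE CHOICE OF μ ON THE RISK NEUTRAL PROBABILITY CAN BE SEEN NUMERICALLY. THE SKETCH BELOW IS ONLY AN ILLUSTRATION: THE INPUTS (r = 5%, σ = 20%, ONE YEAR, 50 STEPS) AND ALL FUNCTION NAMES ARE HYPOTHETICAL, AND THE RISK NEUTRAL PROBABILITY IS TAKEN TO BE THE STANDARD $q = (e^{r\Delta t} - d)/(u - d)$, WHICH THE SLIDES DO NOT STATE EXPLICITLY.

```python
from math import exp, log, sqrt

def up_down_factors(mu, sigma, tau, n, theta=0.5):
    """Per-step up and down factors for an n-step tree over horizon tau."""
    dt = tau / n
    u = exp(mu * dt + sigma * sqrt(dt) * sqrt((1 - theta) / theta))
    d = exp(mu * dt - sigma * sqrt(dt) * sqrt(theta / (1 - theta)))
    return u, d

def log_return_moments(u, d, theta):
    """Mean and variance of the one-step log return under probability theta."""
    a, b = log(u), log(d)
    return theta * a + (1 - theta) * b, theta * (1 - theta) * (a - b) ** 2

def risk_neutral_probability(u, d, r, dt):
    """Probability q that makes the expected one-step gross return equal exp(r*dt)."""
    return (exp(r * dt) - d) / (u - d)

r, sigma, tau, n = 0.05, 0.20, 1.0, 50                          # hypothetical inputs
dt = tau / n
u0, d0 = up_down_factors(0.0, sigma, tau, n)                    # mu = 0, theta = 0.5
u1, d1 = up_down_factors(r - 0.5 * sigma**2, sigma, tau, n)     # mu = r - sigma^2/2
q0 = risk_neutral_probability(u0, d0, r, dt)
q1 = risk_neutral_probability(u1, d1, r, dt)
print(round(q0, 4), round(q1, 4))
```

FOR THESE INPUTS THE SECOND PROBABILITY SHOULD LIE CLOSER TO 0.5 THAN THE FIRST, WHICH IS THE BEHAVIOUR DESCRIBED ABOVE.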


LOGNORMAL DISTRIBUTION

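LOGNORMAL DISTRIBUTION: IN ITS SIMPLEST FORM, THE DENSITY FUNCTION OF A VARIABLE WHOSE LOGARITHM FOLLOWS A NORMAL PROBABILITY DISTRIBUTION.

CONSIDER A RANDOM VARIABLE X WITH RANGE SPACE $R_X = \{x : 0 < x < \infty\}$, WHERE $Y = \log_e X = \ln X$ IS NORMALLY DISTRIBUTED WITH MEAN $\mu_Y$ AND VARIANCE $\sigma_Y^2$. THEN

$E(X) = \mu_X = e^{\mu_Y + \frac{1}{2}\sigma_Y^2}$

$\sigma_X^2 = e^{2\mu_Y + \sigma_Y^2}\left(e^{\sigma_Y^2} - 1\right)$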


LOGNORMAL DISTRIBUTION: A LOGNORMAL DISTRIBUTION IS BOUNDED BELOW BY ZERO AND IS SKEWED TO THE RIGHT.

A LOGNORMAL DISTRIBUTION IS USEFUL FOR DESCRIBING THE PRICES OF MANY FINANCIAL ASSETS, AND A NORMAL DISTRIBUTION IS OFTEN A GOOD APPROXIMATION FOR RETURNS.

A LOGNORMAL DISTRIBUTION IS DEFINED BY ITS MEAN AND VARIANCE, WHICH IN TURN ARE DERIVED FROM THE MEAN AND VARIANCE OF ITS ASSOCIATED NORMAL DISTRIBUTION.

WHEN σ INCREASES, THE MEAN OF THE LOGNORMAL INCREASES: THE DISTRIBUTION CAN SPREAD OUTWARDS TO THE RIGHT BUT IT CANNOT SPREAD BELOW ZERO, THEREFORE ITS MEAN INCREASES.

A NORMAL DISTRIBUTION IS A CLOSER FIT FOR QUARTERLY AND YEARLY HOLDING PERIOD RETURNS THAN IT IS FOR DAILY OR WEEKLY RETURNS.

A NORMAL DISTRIBUTION IS LESS SUITABLE FOR ASSET PRICES SINCE THEY CANNOT FALL BELOW ZERO.


LOGNORMAL DISTRIBUTION: $\ln(S_T/S_0) = r(0,T)$ REPRESENTS THE CONTINUOUSLY COMPOUNDED RETURN.

$S_T/S_0 = (S_T/S_{T-1}) \cdot (S_{T-1}/S_{T-2}) \cdots (S_1/S_0)$. TAKING LOGS ON BOTH SIDES,

$r(0,T) = r(T-1,T) + r(T-2,T-1) + \cdots + r(0,1)$.

USING CONTINUOUSLY COMPOUNDED RETURNS, A HOLDING PERIOD RETURN IS THE SUM OF THE CONTINUOUSLY COMPOUNDED RETURNS OF THE SMALLER PERIODS.

ASSUMING THAT THE ONE-PERIOD RETURNS ARE INDEPENDENT AND IDENTICALLY DISTRIBUTED (IID), IT CAN BE SHOWN THAT THE MEAN RETURNS ADD UP AND THE VARIANCES ADD UP.


LOGNORMAL DISTRIBUTION: $\ln(S_T/S_0)$, WHICH REPRESENTS THE CONTINUOUSLY COMPOUNDED RETURN, IS NORMALLY DISTRIBUTED WITH MEAN $(\mu - \frac{1}{2}\sigma^2)T$ AND STANDARD DEVIATION $\sigma\sqrt{T}$.

A VARIABLE LIKE A STOCK PRICE, WHICH CAN TAKE VALUES BETWEEN 0 AND INFINITY, IS MODELLED AS LOGNORMALLY DISTRIBUTED.

(1 + STOCK RETURN), WHICH CAN ALSO TAKE VALUES BETWEEN 0 AND INFINITY, IS LOGNORMALLY DISTRIBUTED. THE LOG OF (1 + STOCK RETURN), $\ln(S_{t+1}/S_t)$, WHICH REPRESENTS THE CONTINUOUSLY COMPOUNDED RETURN, IS NORMALLY DISTRIBUTED.


[FIGURE: LOGNORMAL DISTRIBUTION, HORIZONTAL AXIS STARTING AT 0]


CONSIDER A STOCK WITH AN INITIAL PRICE OF 40 DOLLARS, AN EXPECTED RETURN OF 16% PER ANNUM AND A VOLATILITY OF 20% PER ANNUM.

THE DISTRIBUTION OF THE STOCK PRICE $S_T$ SIX MONTHS FROM NOW IS GIVEN BY

$\ln S_T \sim N\left(\ln 40 + (0.16 - 0.2^2/2) \times 0.5,\ 0.20\sqrt{0.5}\right)$, I.E. $\ln S_T \sim N(3.759,\ 0.141)$, WHERE THE SECOND ARGUMENT IS THE STANDARD DEVIATION.

THERE IS A 95% PROBABILITY THAT A NORMALLY DISTRIBUTED VARIABLE HAS A VALUE WITHIN 1.96 STANDARD DEVIATIONS OF ITS MEAN. THIS MEANS THAT

$3.759 - 1.96 \times 0.141 < \ln S_T < 3.759 + 1.96 \times 0.141$, OR $e^{3.482} < S_T < e^{4.035}$, OR $32.55 < S_T < 56.56$.
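THE STOCK PRICE INTERVAL ABOVE CAN BE RECOMPUTED DIRECTLY. THE FOLLOWING IS A MINIMAL SKETCH; THE FUNCTION NAME lognormal_price_interval IS HYPOTHETICAL, AND THE MULTIPLIER 1.96 IS TAKEN FROM THE TEXT.

```python
from math import exp, log, sqrt

def lognormal_price_interval(s0, mu, sigma, t, z=1.96):
    """Interval for S_T when ln S_T ~ N(ln s0 + (mu - sigma^2/2) t, sd = sigma sqrt(t))."""
    mean_log = log(s0) + (mu - 0.5 * sigma**2) * t
    sd_log = sigma * sqrt(t)
    return exp(mean_log - z * sd_log), exp(mean_log + z * sd_log)

# Inputs from the example: price 40, return 16%, volatility 20%, horizon 0.5 years.
low, high = lognormal_price_interval(40.0, 0.16, 0.20, 0.5)
print(round(low, 2), round(high, 2))
```

THE PRINTED BOUNDS MAY DIFFER SLIGHTLY FROM 32.55 AND 56.56 BECAUSE THE SLIDE ROUNDS THE MEAN AND STANDARD DEVIATION TO THREE DECIMALS BEFORE EXPONENTIATING.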


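LOGNORMAL DISTRIBUTION: THE NORMAL DISTRIBUTION HAS ADDITIVE REPRODUCTIVE PROPERTIES; THE LOGNORMAL DISTRIBUTION HAS MULTIPLICATIVE REPRODUCTIVE PROPERTIES.

IF $X_1$ AND $X_2$ ARE INDEPENDENT LOGNORMAL VARIABLES WITH PARAMETERS $(\mu_{Y_1}, \sigma_{Y_1}^2)$ AND $(\mu_{Y_2}, \sigma_{Y_2}^2)$, THEN $W = X_1 X_2$ HAS A LOGNORMAL DISTRIBUTION WITH PARAMETERS $(\mu_{Y_1} + \mu_{Y_2},\ \sigma_{Y_1}^2 + \sigma_{Y_2}^2)$, I.E. $\ln W$ HAS MEAN $\mu_{Y_1} + \mu_{Y_2}$ AND VARIANCE $\sigma_{Y_1}^2 + \sigma_{Y_2}^2$.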


LOGNORMAL DISTRIBUTION: IF $X_1, X_2, \ldots, X_n$ ARE INDEPENDENT LOGNORMAL VARIATES AND EACH ONE HAS THE SAME PARAMETERS $(\mu_Y, \sigma_Y^2)$, THEN THE GEOMETRIC MEAN $(X_1 X_2 X_3 \cdots X_n)^{1/n}$ HAS A LOGNORMAL DISTRIBUTION WITH PARAMETERS $\mu_Y$ AND $\sigma_Y^2/n$.

LET'S LOOK AT AN EXAMPLE: IF $Y = \ln X$ HAS AN N(10, 4) DISTRIBUTION, THEN X HAS A LOGNORMAL DISTRIBUTION WITH

MEAN $= e^{10+2} = 162754.79$ AND

VARIANCE $= e^{24}(e^{4} - 1) = 53.598\,e^{24}$.

$P(X \le 1000) = P(\ln X \le \ln 1000) = P(Y \le \ln 1000) = P(Z \le (\ln 1000 - 10)/2) = P(Z \le -1.55) = 0.0606$
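THE NUMBERS IN THIS EXAMPLE FOLLOW DIRECTLY FROM THE LOGNORMAL MEAN AND VARIANCE FORMULAS. A SMALL SKETCH, WITH HYPOTHETICAL FUNCTION NAMES, THAT RECOMPUTES THEM:

```python
from math import exp, log, sqrt
from statistics import NormalDist

def lognormal_mean(mu_y, var_y):
    """E(X) when ln X ~ N(mu_y, var_y)."""
    return exp(mu_y + 0.5 * var_y)

def lognormal_variance(mu_y, var_y):
    """Var(X) when ln X ~ N(mu_y, var_y)."""
    return exp(2 * mu_y + var_y) * (exp(var_y) - 1)

def lognormal_cdf(x, mu_y, var_y):
    """P(X <= x), obtained by standardising ln x."""
    z = (log(x) - mu_y) / sqrt(var_y)
    return NormalDist().cdf(z)

# Example from the text: Y = ln X ~ N(10, 4).
print(round(lognormal_mean(10, 4), 2))
print(round(lognormal_variance(10, 4) / exp(24), 3))   # variance as a multiple of e^24
print(round(lognormal_cdf(1000, 10, 4), 4))
```

THE LAST VALUE USES THE UNROUNDED z, SO IT CAN DIFFER FROM THE TABLE-BASED 0.0606 IN THE FOURTH DECIMAL.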


LOGNORMAL DISTRIBUTION: A LOGNORMAL DISTRIBUTION ALLOWS FOR MORE UPSIDE MOVEMENT. FOR EXAMPLE, RS. 100 CONTINUOUSLY COMPOUNDED AT 12% P.A. WOULD GROW TO 112.75 (+12.75), WHILE A 12% DOWN MOVEMENT WOULD LEAD TO 88.69 (-11.31).

IN OTHER WORDS, TO REACH THE SAME UP AND DOWN LEVELS (SAY ±10%) WE REQUIRE A LARGER MOVEMENT (PROPORTIONAL OR CONTINUOUSLY COMPOUNDED) ON THE DOWNSIDE THAN ON THE UPSIDE: ln 1.10 = 0.0953 AGAINST ln 0.90 = -0.1054.

A 10% OTM CALL WILL BE MORE EXPENSIVE THAN A 10% OTM PUT, SINCE A SMALLER MOVEMENT IS REQUIRED TO CROSS THE STRIKE PRICE FOR A 10% OTM CALL THAN FOR A 10% OTM PUT.


SAMPLING & DESCRIPTIVE STATISTICS: STATISTICS IS THE SCIENCE OF DRAWING CONCLUSIONS ABOUT A POPULATION BASED ON ANALYSIS OF DATA FROM THAT POPULATION.

MEASURES OF CENTRAL TENDENCY:

MEAN: $\bar{X} = \sum X_i / n$. THE MEAN IS THE AVERAGE OF ALL OBSERVATIONS AND IS THEREFORE IMPACTED BY EXTREME VALUES.

MEDIAN: THE MEDIAN IS THE VALUE OF THE (n+1)/2-TH ORDERED OBSERVATION. IF n = 101 THE MEDIAN IS THE 51ST OBSERVATION, AND IF n = 100 THE MEDIAN IS THE "50.5TH" OBSERVATION, I.E. THE AVERAGE OF THE 50TH AND 51ST. THE MEDIAN IS NOT INFLUENCED BY EXTREME OBSERVATIONS.

MODE: THE MOST FREQUENT OBSERVATION.

IF THE DATA ARE SYMMETRIC THEN THE MEAN AND MEDIAN COINCIDE. IN ADDITION, IF THE DATA HAVE ONLY ONE MODE, THEN THE MEAN, MEDIAN AND MODE ALL COINCIDE.


STATISTICS & SAMPLING DISTRIBUTIONS: IF $X_1, X_2, \ldots, X_n$ IS A RANDOM SAMPLE OF SIZE n, THEN THE SAMPLE MEAN $\bar{X}$ AND THE SAMPLE VARIANCE $S^2$ ARE STATISTICS.

SINCE A STATISTIC IS A FUNCTION OF THE DATA FROM A RANDOM SAMPLE, IT IS A RANDOM VARIABLE.

IN OTHER WORDS, IF WE TAKE TWO DIFFERENT SAMPLES AND COMPUTE THEIR MEANS, THE MEANS WOULD GENERALLY BE DIFFERENT.

THE PROCESS OF DRAWING CONCLUSIONS ABOUT POPULATIONS BASED ON SAMPLE DATA MAKES CONSIDERABLE USE OF STATISTICS.

IN GENERAL WE CALL THE PROBABILITY DISTRIBUTION OF A STATISTIC A SAMPLING DISTRIBUTION.

LET'S LOOK AT THE SAMPLING DISTRIBUTION OF THE SAMPLE MEAN $\bar{X}$.

HERE THE POPULATION MEAN IS µ AND THE POPULATION VARIANCE IS σ².

NOTE THAT EACH OBSERVATION IN THE RANDOM SAMPLE IS NORMALLY DISTRIBUTED WITH MEAN µ AND VARIANCE σ².


STATISTICS & SAMPLING DISTRIBUTIONS: USING THE REPRODUCTIVE PROPERTY OF THE NORMAL DISTRIBUTION, THE MEAN OF THE SAMPLE MEAN WILL BE

$E(\bar{X}) = \frac{1}{n}(\mu + \mu + \cdots + \mu) = \mu$ (n TERMS).

THE VARIANCE OF THE SAMPLE MEAN WILL BE

$\mathrm{Var}(\bar{X}) = \mathrm{Var}\left[\frac{1}{n}(X_1 + X_2 + \cdots + X_n)\right] = \frac{1}{n^2}(\sigma^2 + \sigma^2 + \cdots + \sigma^2) = \frac{1}{n^2}\, n\sigma^2 = \frac{\sigma^2}{n}$.

THEREFORE THE DISTRIBUTION OF $\bar{X}$ IS NORMAL WITH MEAN µ AND VARIANCE σ²/n. THE TERM $\sqrt{\sigma^2/n} = \sigma/\sqrt{n}$ IS ALSO REFERRED TO AS THE STANDARD ERROR.


ESTIMATORS: SUPPOSE THAT X IS A RANDOM VARIABLE WITH MEAN µ AND VARIANCE σ².

WE CAN SHOW THAT THE SAMPLE MEAN $\bar{X}$ AND THE SAMPLE VARIANCE $S^2$ ARE UNBIASED ESTIMATORS OF µ AND σ² RESPECTIVELY.

$E(\bar{X}) = E\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right] = \frac{1}{n}\, E\left[\sum_{i=1}^{n} X_i\right] = \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \frac{1}{n}\sum_{i=1}^{n} \mu = \mu$

SIMILARLY IT CAN BE SHOWN THAT $E(S^2) = \sigma^2$.


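$E(S^2) = E\left[\frac{\sum (X_i - \bar{X})^2}{n-1}\right]$

$= \frac{1}{n-1}\, E\left[\sum (X_i - \bar{X})^2\right]$

$= \frac{1}{n-1}\, E\left[\sum (X_i^2 + \bar{X}^2 - 2\bar{X}X_i)\right]$

$= \frac{1}{n-1}\, E\left[\sum X_i^2 - n\bar{X}^2\right]$

$= \frac{1}{n-1}\left[\sum E(X_i^2) - n\,E(\bar{X}^2)\right]$

SINCE $E(X_i^2) = \mu^2 + \sigma^2$ AND $E(\bar{X}^2) = \mu^2 + \sigma^2/n$,

$E(S^2) = \frac{1}{n-1}\left[\sum_{i=1}^{n} (\mu^2 + \sigma^2) - n(\mu^2 + \sigma^2/n)\right] = \frac{1}{n-1}\left[n\mu^2 + n\sigma^2 - n\mu^2 - \sigma^2\right] = \sigma^2$.

THEREFORE THE SAMPLE VARIANCE IS AN UNBIASED ESTIMATE OF THE POPULATION VARIANCE.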
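THE UNBIASEDNESS RESULT CAN BE VERIFIED EXACTLY ON A TOY CASE. THE SKETCH BELOW ENUMERATES EVERY EQUALLY LIKELY SAMPLE OF SIZE 3 DRAWN WITH REPLACEMENT FROM A HYPOTHETICAL POPULATION {0, 1, 2} AND AVERAGES THE SAMPLE VARIANCE, ONCE WITH DIVISOR n-1 AND ONCE WITH DIVISOR n. THE POPULATION AND FUNCTION NAMES ARE ASSUMPTIONS MADE ONLY FOR ILLUSTRATION.

```python
from fractions import Fraction
from itertools import product

def sample_variance(xs, ddof):
    """Sum of squared deviations from the sample mean, divided by (n - ddof)."""
    n = len(xs)
    mean = Fraction(sum(xs), n)
    return sum((x - mean) ** 2 for x in xs) / (n - ddof)

def expected_sample_variance(values, n, ddof):
    """Exact expectation over all len(values)**n equally likely samples (with replacement)."""
    samples = list(product(values, repeat=n))
    return sum(sample_variance(s, ddof) for s in samples) / len(samples)

population = [0, 1, 2]                     # hypothetical, each value equally likely
pop_mean = Fraction(sum(population), len(population))
pop_var = sum((x - pop_mean) ** 2 for x in population) / len(population)

print(pop_var)                                        # population variance
print(expected_sample_variance(population, 3, 1))     # divisor n-1
print(expected_sample_variance(population, 3, 0))     # divisor n
```

WITH DIVISOR n-1 THE EXPECTATION SHOULD EQUAL THE POPULATION VARIANCE, WHILE DIVISOR n SHOULD GIVE A SMALLER VALUE, NAMELY (n-1)/n TIMES σ².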


CONFIDENCE INTERVAL ESTIMATION: IN MANY SITUATIONS A POINT ESTIMATE DOESN'T PROVIDE ENOUGH INFORMATION ABOUT THE PARAMETER OF INTEREST.

AN INTERVAL ESTIMATE OF AN UNKNOWN PARAMETER θ IS OF THE FORM L ≤ θ ≤ U. THE END POINTS OF THE INTERVAL ARE RANDOM VARIABLES.

P(L ≤ θ ≤ U) = 1 - α. THIS INTERVAL IS CALLED A 100(1-α) PERCENT CONFIDENCE INTERVAL FOR THE UNKNOWN PARAMETER θ.

L AND U ARE CALLED THE LOWER AND UPPER CONFIDENCE LIMITS. (1-α) IS CALLED THE CONFIDENCE COEFFICIENT.

WE SAY THAT θ LIES IN THE OBSERVED INTERVAL WITH CONFIDENCE 100(1-α) PERCENT.

THE ABOVE STATEMENT HAS A FREQUENCY INTERPRETATION, I.E. WE DON'T KNOW IF THE STATEMENT IS TRUE FOR THIS SPECIFIC SAMPLE, BUT THE METHOD USED TO OBTAIN THE INTERVAL [L, U] YIELDS CORRECT STATEMENTS 100(1-α) PERCENT OF THE TIME.


CONFIDENCE INTERVAL ESTIMATION: THE CONFIDENCE INTERVAL DISCUSSED EARLIER IS CALLED A TWO-SIDED CONFIDENCE INTERVAL AS IT SPECIFIES BOTH A LOWER AND AN UPPER LIMIT ON θ.

OCCASIONALLY, A ONE-SIDED CONFIDENCE INTERVAL MIGHT BE MORE APPROPRIATE. A ONE-SIDED 100(1-α) PERCENT LOWER CONFIDENCE INTERVAL ON θ IS GIVEN BY THE INTERVAL L ≤ θ, WHERE THE LOWER CONFIDENCE LIMIT L IS CHOSEN SO THAT P(L ≤ θ) = 1 - α. SIMILARLY, ON THE UPPER SIDE WE CAN HAVE P(θ ≤ U) = 1 - α.

THE LENGTH OF THE CONFIDENCE INTERVAL IS AN IMPORTANT MEASURE OF THE QUALITY OF THE INFORMATION OBTAINED FROM THE SAMPLE.

THE LONGER THE INTERVAL, THE MORE CONFIDENT WE ARE THAT THE INTERVAL ACTUALLY CONTAINS THE TRUE VALUE OF θ.

ON THE OTHER HAND, THE LONGER THE INTERVAL, THE LESS INFORMATION WE HAVE ABOUT THE TRUE VALUE OF θ.

IN AN IDEAL SITUATION, WE OBTAIN A RELATIVELY SHORT INTERVAL WITH HIGH CONFIDENCE.


[FIGURE: CONFIDENCE INTERVAL ESTIMATION, DISTRIBUTION WITH AREA α/2 IN EACH TAIL]


CONFIDENCE INTERVAL ESTIMATION: LET X BE A RANDOM VARIABLE WITH UNKNOWN MEAN µ AND KNOWN VARIANCE σ².

SUPPOSE THAT A RANDOM SAMPLE OF SIZE n, $X_1, X_2, \ldots, X_n$, IS TAKEN. A 100(1-α) PERCENT CONFIDENCE INTERVAL ON µ CAN BE OBTAINED BY CONSIDERING THE SAMPLING DISTRIBUTION OF THE SAMPLE MEAN $\bar{X}$.

THE MEAN OF $\bar{X}$ IS µ AND THE VARIANCE IS σ²/n. THEREFORE, THE DISTRIBUTION OF THE STATISTIC

$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ IS TAKEN TO BE THE STANDARD NORMAL DISTRIBUTION.

$P(-z_{\alpha/2} \le Z \le z_{\alpha/2}) = 1 - \alpha$

$P\left(-z_{\alpha/2} \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le z_{\alpha/2}\right) = 1 - \alpha$

OR $P\left(\bar{X} - z_{\alpha/2}\,\sigma/\sqrt{n} \le \mu \le \bar{X} + z_{\alpha/2}\,\sigma/\sqrt{n}\right) = 1 - \alpha$


CONFIDENCE INTERVAL ESTIMATION: EXAMPLE: A QUALITY INSPECTOR IS INVESTIGATING THE INTERNAL PRESSURE STRENGTH OF A 1 LITRE GLASS SOFT DRINK BOTTLE. PRESSURE STRENGTH IS NORMALLY DISTRIBUTED WITH A STANDARD DEVIATION OF 30 PSI. A RANDOM SAMPLE OF 25 BOTTLES HAD A MEAN PRESSURE STRENGTH OF 278 PSI. A 95% CONFIDENCE INTERVAL FOR µ IS

$278 - 1.96 \times 30/5 \le \mu \le 278 + 1.96 \times 30/5$, I.E. $266.24 \le \mu \le 289.76$.

WE CAN SAY WITH 95% CONFIDENCE THAT THE POPULATION MEAN LIES WITHIN THE GIVEN RANGE.

IN SITUATIONS WHERE THE SAMPLE SIZE CAN BE CONTROLLED, WE CAN CHOOSE n SO THAT WE ARE 100(1-α) PERCENT CONFIDENT THAT THE ERROR IN ESTIMATING µ IS LESS THAN A SPECIFIED ERROR E. THE APPROPRIATE SAMPLE SIZE IS

$n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2$
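THE BOTTLE EXAMPLE AND THE SAMPLE SIZE FORMULA CAN BE SKETCHED IN A FEW LINES. THE FUNCTION NAMES ARE HYPOTHETICAL, THE TARGET ERROR OF 5 PSI IS AN ASSUMED VALUE NOT FOUND IN THE SLIDES, AND ROUNDING n UP TO THE NEXT INTEGER IS AN ADDED CONVENTION.

```python
from math import ceil, sqrt
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, alpha=0.05):
    """Two-sided 100(1-alpha)% interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sigma / sqrt(n)
    return sample_mean - half_width, sample_mean + half_width

def sample_size_for_error(sigma, max_error, alpha=0.05):
    """Smallest integer n with z * sigma / sqrt(n) <= max_error (rounded up)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil((z * sigma / max_error) ** 2)

# Example from the text: mean 278 psi, sigma 30 psi, n = 25.
low, high = z_confidence_interval(278.0, 30.0, 25)
print(round(low, 2), round(high, 2))

# Hypothetical target: estimate the mean to within 5 psi at 95% confidence.
print(sample_size_for_error(30.0, 5.0))
```

THE INTERVAL SHOULD MATCH THE HAND CALCULATION ABOVE TO TWO DECIMAL PLACES.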


CONFIDENCE INTERVAL ESTIMATION: CONFIDENCE INTERVAL ON THE MEAN OF A NORMAL DISTRIBUTION, VARIANCE UNKNOWN

$t = \frac{\bar{X} - \mu}{S/\sqrt{n}}$ HAS A t DISTRIBUTION WITH n-1 DEGREES OF FREEDOM.

$P\left(-t_{\alpha/2,\,n-1} \le \frac{\bar{X} - \mu}{S/\sqrt{n}} \le t_{\alpha/2,\,n-1}\right) = 1 - \alpha$


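CONFIDENCE INTERVAL ESTIMATION: CONFIDENCE INTERVAL ON THE VARIANCE OF A NORMAL DISTRIBUTION

IF X IS A NORMAL RANDOM VARIABLE WITH UNKNOWN MEAN AND VARIANCE, THEN $\chi^2 = \frac{(n-1)S^2}{\sigma^2}$ HAS A $\chi^2$ DISTRIBUTION WITH n-1 DEGREES OF FREEDOM.

$P\left(\chi^2_{1-\alpha/2,\,n-1} \le \frac{(n-1)S^2}{\sigma^2} \le \chi^2_{\alpha/2,\,n-1}\right) = 1 - \alpha$

THIS CAN BE REARRANGED AS

$P\left(\frac{(n-1)S^2}{\chi^2_{\alpha/2,\,n-1}} \le \sigma^2 \le \frac{(n-1)S^2}{\chi^2_{1-\alpha/2,\,n-1}}\right) = 1 - \alpha$

[FIGURE: $\chi^2_{n-1}$ DISTRIBUTION STARTING AT 0, WITH AREA α/2 IN EACH TAIL, BELOW $\chi^2_{1-\alpha/2,\,n-1}$ AND ABOVE $\chi^2_{\alpha/2,\,n-1}$]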


TESTS OF HYPOTHESIS: MANY PROBLEMS REQUIRE THAT WE DECIDE WHETHER A STATEMENT ABOUT SOME PARAMETER IS TRUE OR FALSE.

THE STATEMENT IS USUALLY CALLED A HYPOTHESIS, AND THE DECISION MAKING PROCEDURE ABOUT THE TRUTH OR FALSITY OF THE HYPOTHESIS IS CALLED HYPOTHESIS TESTING.

A STATISTICAL HYPOTHESIS IS A STATEMENT ABOUT THE PROBABILITY DISTRIBUTION OF A RANDOM VARIABLE.

SUPPOSE WE ARE INTERESTED IN THE MEAN COMPRESSIVE STRENGTH OF A PARTICULAR TYPE OF CONCRETE.

WE ARE INTERESTED IN DECIDING WHETHER THE MEAN IS 2500 PSI. WE MAY EXPRESS THIS FORMALLY AS

H0: µ = 2500 PSI
H1: µ ≠ 2500 PSI


TESTS OF HYPOTHESIS: H0 IS CALLED THE NULL HYPOTHESIS. H1 IS CALLED THE ALTERNATIVE HYPOTHESIS.

IN SOME SITUATIONS THE ALTERNATIVE HYPOTHESIS MAY BE ONE SIDED, E.G. H1: µ > 2500.

IT IS IMPORTANT TO NOTE THAT HYPOTHESES ARE ALWAYS STATEMENTS ABOUT POPULATIONS AND NOT ABOUT SAMPLES.

SUPPOSE WE TEST A SAMPLE AND SAY THAT WE WILL REJECT H0 IF $\bar{X} > 2550$ OR $\bar{X} < 2450$. THE SET OF ALL VALUES GREATER THAN 2550 OR LESS THAN 2450 IS CALLED THE CRITICAL REGION OR THE REJECTION REGION.

THE INTERVAL FROM 2450 TO 2550 IS CALLED THE ACCEPTANCE REGION.


TYPE I AND TYPE II ERRORS:

             H0 IS TRUE      H0 IS FALSE
ACCEPT H0    NO ERROR        TYPE II ERROR
REJECT H0    TYPE I ERROR    NO ERROR

α = P(TYPE I ERROR) = P(REJECT H0 | H0 IS TRUE)
β = P(TYPE II ERROR) = P(ACCEPT H0 | H0 IS FALSE)

SOMETIMES IT IS MORE USEFUL TO WORK WITH THE POWER OF THE TEST. POWER IS DEFINED AS (1-β) = P(REJECT H0 | H0 IS FALSE).

THE POWER OF THE TEST IS THE PROBABILITY THAT A FALSE NULL HYPOTHESIS IS CORRECTLY REJECTED.

THE PROBABILITY OF A TYPE I ERROR IS CALLED THE SIGNIFICANCE LEVEL OR SIZE OF THE TEST.


TYPE I AND TYPE II ERRORS: IN OUR EXAMPLE A TYPE I ERROR WILL OCCUR IF THE SAMPLE MEAN IS > 2550 OR < 2450 WHEN IN FACT THE TRUE MEAN IS 2500.

THE TYPE I ERROR PROBABILITY IS CONTROLLED BY THE LOCATION OF THE CRITICAL REGION. IT IS EASY FOR AN ANALYST TO SET THE DESIRED VALUE OF THE TYPE I ERROR PROBABILITY.

WHAT IF THE ACTUAL MEAN IS DIFFERENT FROM 2500, I.E. THE NULL HYPOTHESIS IS FALSE? THE PROBABILITY OF A TYPE II ERROR IS NOT A CONSTANT BUT DEPENDS UPON THE TRUE MEAN COMPRESSIVE STRENGTH OF THE CONCRETE, µ.

LET β(µ) DENOTE THE TYPE II ERROR PROBABILITY CORRESPONDING TO µ.


NOTE HERE THAT β(2700) < β(2600). IN OTHER WORDS, SMALLER DEVIATIONS ARE HARDER TO DETECT THAN LARGER ONES.

IN ADDITION, THE PROBABILITY OF A TYPE II ERROR IS ALSO A FUNCTION OF THE SAMPLE SIZE. IT DECREASES AS THE SAMPLE SIZE INCREASES.

FINALLY, THE TYPE II ERROR PROBABILITY ALSO DEPENDS UPON THE TYPE I ERROR PROBABILITY α. DECREASING α CAUSES THE TYPE II ERROR PROBABILITY TO INCREASE.

BECAUSE THE TYPE II ERROR PROBABILITY IS A FUNCTION OF THE SAMPLE SIZE, THE TYPE I ERROR PROBABILITY AND THE TRUE MEAN, WE GENERALLY SAY "WE FAIL TO REJECT H0" RATHER THAN "WE ACCEPT H0".

MANY HYPOTHESIS TESTING PROBLEMS REQUIRE A ONE SIDED TEST SUCH AS

H0: μ = μ0; H1: μ > μ0. THIS WOULD MEAN THAT THE CRITICAL REGION IS IN THE UPPER TAIL, THAT IS, WE WOULD REJECT H0 IF THE SAMPLE MEAN IS TOO LARGE.

HERE NOTE THAT IF THE TRUE MEAN IS > μ0, THEN THE ONE SIDED TEST IS BETTER. IF THE TRUE MEAN = μ0, THEN THE ONE AND TWO SIDED TESTS ARE EQUIVALENT. HOWEVER, IF THE TRUE MEAN IS < μ0, THEN THE TWO SIDED TEST IS BETTER THAN THE ONE SIDED TEST.


WHAT ABOUT THE FOLLOWING HYPOTHESES: H0: μ ≤ μ0; H1: μ > μ0? HERE, UNDER H0, WE ARE ASSUMING THAT μ CAN BE LESS THAN μ0 BUT CANNOT BE GREATER.

IN SITUATIONS WHERE ONE SIDED HYPOTHESES ARE APPROPRIATE, WE WILL USUALLY WRITE THE NULL HYPOTHESIS WITH AN EQUALITY SIGN (H0: μ = μ0), WHICH IS UNDERSTOOD TO INCLUDE THE CASES WHERE μ < μ0.

SUPPOSE A SOFT DRINK BOTTLER PURCHASES BOTTLES. THE BOTTLES WILL BE ACCEPTED ONLY IF THE MEAN PRESSURE STRENGTH THEY CAN WITHSTAND MEETS THE SPECIFICATION OF 200 PSI.

NOW THE HYPOTHESES CAN BE FORMULATED IN TWO WAYS:
1. H0: μ ≤ 200, H1: μ > 200
2. H0: μ ≥ 200, H1: μ < 200

IN FORMULATION 1 THERE IS A PROBABILITY THAT H0 WILL BE ACCEPTED (I.E. THE BOTTLES ARE JUDGED NOT SATISFACTORY) EVEN THOUGH THE TRUE MEAN IS GREATER THAN 200. THIS FORMULATION IMPLIES THAT THE BOTTLER WANTS THE BOTTLES TO MEET OR EXCEED THE SPECIFICATION. IN FORMULATION 2 THERE IS A PROBABILITY THAT H0 WILL BE ACCEPTED EVEN THOUGH THE TRUE MEAN IS SLIGHTLY LESS THAN 200.


IN FORMULATING ONE SIDED HYPOTHESES, WE SHOULD ALWAYS REMEMBER THAT REJECTING H0 IS ALWAYS A STRONG CONCLUSION, AND CONSEQUENTLY WE SHOULD PUT THE STATEMENT ABOUT WHICH IT IS IMPORTANT TO MAKE A STRONG CONCLUSION IN THE ALTERNATIVE HYPOTHESIS.

TEST OF HYPOTHESIS ON THE MEAN, VARIANCE KNOWN: H0: μ = μ0; H1: μ ≠ μ0. A SAMPLE IS TAKEN AND $Z_0 = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$ IS COMPUTED.

HERE WE WOULD REJECT H0 IF $Z_0 > z_{\alpha/2}$ OR IF $Z_0 < -z_{\alpha/2}$ (REJECTION REGION), AND WE WOULD FAIL TO REJECT H0 IF $-z_{\alpha/2} \le Z_0 \le z_{\alpha/2}$.

FOR EXAMPLE: H0: μ = 40 AND H1: μ ≠ 40, WITH n = 25, α = 0.05, σ = 2 AND A SAMPLE MEAN OF 41.25.

$z_{0.025} = 1.96$ AND $-z_{0.025} = -1.96$; $Z_0 = \frac{41.25 - 40}{2/\sqrt{25}} = 3.125$

SINCE Z0 FALLS IN THE CRITICAL REGION, H0 IS REJECTED.
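THE SAME TEST CAN BE WRITTEN AS A SHORT FUNCTION. THIS IS A MINIMAL SKETCH WITH A HYPOTHETICAL FUNCTION NAME; IT USES THE EXACT NORMAL QUANTILE RATHER THAN THE ROUNDED 1.96.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Return (z0, critical value, reject?) for H0: mu = mu0 against H1: mu != mu0."""
    z0 = (sample_mean - mu0) / (sigma / sqrt(n))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return z0, z_crit, abs(z0) > z_crit

# Example from the text: sample mean 41.25, mu0 = 40, sigma = 2, n = 25.
z0, z_crit, reject = two_sided_z_test(41.25, 40.0, 2.0, 25)
print(round(z0, 3), round(z_crit, 2), reject)
```

THE RETURNED FLAG IS TRUE WHEN Z0 FALLS IN THE CRITICAL REGION, AS IT DOES IN THIS EXAMPLE.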


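CHOICE OF SAMPLE SIZE TO CONTROL TYPE II ERROR: TAKE H0: μ = μ0 AND H1: μ ≠ μ0, AND LET'S ASSUME THAT THE TRUE MEAN IS μ0 + δ. THEN $Z_0 \sim N(\delta\sqrt{n}/\sigma,\ 1)$. A TYPE II ERROR WILL BE MADE, GIVEN THAT H1 IS TRUE, ONLY IF Z0 LIES BETWEEN $-z_{\alpha/2}$ AND $z_{\alpha/2}$.

IN OTHER WORDS, $\beta = \Phi\left(z_{\alpha/2} - \delta\sqrt{n}/\sigma\right) - \Phi\left(-z_{\alpha/2} - \delta\sqrt{n}/\sigma\right)$.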
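THE FORMULA FOR β CAN BE EVALUATED TO SEE HOW THE TYPE II ERROR PROBABILITY RESPONDS TO δ AND n. IN THE SKETCH BELOW σ = 2 AND n = 25 ARE BORROWED FROM THE EARLIER EXAMPLE, WHILE THE SHIFTS δ = 0.5 AND δ = 1.0 AND THE LARGER SAMPLE n = 100 ARE HYPOTHETICAL VALUES.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(delta, sigma, n, alpha=0.05):
    """beta = Phi(z - delta*sqrt(n)/sigma) - Phi(-z - delta*sqrt(n)/sigma), z = z_{alpha/2}."""
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1 - alpha / 2)
    shift = delta * sqrt(n) / sigma
    return std_normal.cdf(z - shift) - std_normal.cdf(-z - shift)

for delta, n in [(0.5, 25), (1.0, 25), (0.5, 100)]:
    print(delta, n, round(type_ii_error(delta, sigma=2.0, n=n), 4))
```

A LARGER δ OR A LARGER n SHOULD BOTH GIVE A SMALLER β, IN LINE WITH THE DISCUSSION OF TYPE II ERRORS ABOVE.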

TEST OF A HYPOTHESIS ON THE MEAN, VARIANCE UNKNOWN: HERE THE VARIANCE IS NOT KNOWN. TO TEST THE HYPOTHESIS H0: μ = μ0 AGAINST H1: μ ≠ μ0 WE USE THE t STATISTIC $t_0 = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}$, WHICH UNDER H0 HAS A t DISTRIBUTION WITH n-1 DEGREES OF FREEDOM.

THE OTHER PRINCIPLES, E.G. FOR THE CRITICAL REGION, ARE THE SAME. IN OTHER WORDS, INSTEAD OF USING THE Z DISTRIBUTION WE USE THE t DISTRIBUTION WITH n-1 DEGREES OF FREEDOM.


TEST OF HYPOTHESIS ON THE VARIANCE OF A NORMAL DISTRIBUTION:

HERE H0: $\sigma^2 = \sigma_0^2$ AND H1: $\sigma^2 \ne \sigma_0^2$.

HERE WE USE THE CHI-SQUARE TEST STATISTIC WITH n-1 DEGREES OF FREEDOM, $\chi_0^2 = \frac{(n-1)S^2}{\sigma_0^2}$.

THEREFORE H0: $\sigma^2 = \sigma_0^2$ WILL BE REJECTED IF $\chi_0^2 > \chi^2_{\alpha/2,\,n-1}$ OR IF $\chi_0^2 < \chi^2_{1-\alpha/2,\,n-1}$.

LET'S CONSIDER AN EXAMPLE WITH A ONE SIDED ALTERNATIVE: H0: $\sigma^2 = 0.02$ AND H1: $\sigma^2 > 0.02$. A RANDOM SAMPLE OF 20 CANS YIELDS A SAMPLE VARIANCE OF 0.0225. THUS THE TEST STATISTIC IS $\chi_0^2 = \frac{19 \times 0.0225}{0.02} = 21.38$. IF WE CHOOSE α = 0.05, WE FIND THAT $\chi^2_{0.05,\,19} = 30.14$. SINCE 21.38 < 30.14, WE CONCLUDE THAT THERE IS NO STRONG EVIDENCE THAT THE VARIANCE EXCEEDS 0.02.
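THE ARITHMETIC OF THIS EXAMPLE IS SHORT ENOUGH TO CHECK IN CODE. THE SKETCH BELOW TAKES THE CRITICAL VALUE 30.14 AS QUOTED IN THE TEXT RATHER THAN COMPUTING IT, AND THE FUNCTION AND CONSTANT NAMES ARE HYPOTHETICAL.

```python
def chi_square_statistic(sample_variance, sigma0_squared, n):
    """Test statistic (n - 1) * S^2 / sigma0^2 for a test on a normal variance."""
    return (n - 1) * sample_variance / sigma0_squared

CHI2_CRITICAL_05_19 = 30.14   # tabulated chi-square value quoted in the text (alpha = 0.05, 19 df)

chi2_0 = chi_square_statistic(0.0225, 0.02, 20)
reject_h0 = chi2_0 > CHI2_CRITICAL_05_19   # upper-tail test for H1: sigma^2 > 0.02
print(round(chi2_0, 2), reject_h0)
```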


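SIMPLE LINEAR REGRESSION: WITH INDEPENDENT VARIABLE x AND DEPENDENT VARIABLE y, THE MODEL IS $y = \beta_0 + \beta_1 x + \varepsilon$. HERE ε IS A RANDOM ERROR WITH MEAN 0 AND VARIANCE σ². FURTHER, THE ERROR TERMS ARE ASSUMED TO BE UNCORRELATED.

LEAST SQUARES: THE SUM OF SQUARES OF THE DEVIATIONS BETWEEN THE ACTUAL OBSERVATIONS AND THE ONES ESTIMATED BY THE REGRESSION LINE IS MINIMISED,

I.E. $L = \sum \varepsilon_i^2 = \sum (y_i - \beta_0 - \beta_1 x_i)^2$. WE CAN REWRITE THE MODEL AS $y = \beta_0' + \beta_1 (x - \bar{x}) + \varepsilon$, WHERE $\beta_0' = \beta_0 + \beta_1 \bar{x}$.

WHEN WE DIFFERENTIATE L WITH RESPECT TO $\beta_0'$ AND $\beta_1$ AND SET THE DERIVATIVES TO ZERO WE GET

$\partial L/\partial \beta_0' = -2\sum [y_i - \beta_0' - \beta_1 (x_i - \bar{x})] = 0$ AND

$\partial L/\partial \beta_1 = -2\sum [y_i - \beta_0' - \beta_1 (x_i - \bar{x})](x_i - \bar{x}) = 0$

SIMPLIFYING THE FIRST EQUATION, AND USING $\sum (x_i - \bar{x}) = 0$, WE GET $n\beta_0' = \sum y_i$, OR $\beta_0' = \bar{y}$ (THE MEAN OF y), SO FINALLY $\beta_0 = \bar{y} - \beta_1 \bar{x}$.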


SIMPLE LINEAR REGRESSION: THE SECOND EQUATION LEADS TO $\beta_1 \sum (x_i - \bar{x})^2 = \sum y_i (x_i - \bar{x})$, OR $\beta_1 = \frac{\sum y_i (x_i - \bar{x})}{\sum (x_i - \bar{x})^2}$.
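THE TWO LEAST SQUARES FORMULAS CAN BE APPLIED DIRECTLY. THE SKETCH BELOW USES A SMALL MADE-UP DATA SET LYING EXACTLY ON THE LINE y = 2 + 3x, SO THE ESTIMATES SHOULD RECOVER THAT INTERCEPT AND SLOPE; THE DATA AND THE FUNCTION NAME ARE HYPOTHETICAL.

```python
def fit_simple_regression(xs, ys):
    """Least squares intercept and slope from the centred-x formulas."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)               # sum of (x_i - x_bar)^2
    sxy = sum(y * (x - x_bar) for x, y in zip(xs, ys))    # sum of y_i (x_i - x_bar)
    beta1 = sxy / sxx
    beta0 = y_bar - beta1 * x_bar
    return beta0, beta1

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [5.0, 8.0, 11.0, 14.0, 17.0]     # made-up data on the line y = 2 + 3x
beta0, beta1 = fit_simple_regression(xs, ys)
print(beta0, beta1)
```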

ASSUMPTIONS:
THE RELATIONSHIP BETWEEN THE DEPENDENT VARIABLE AND THE INDEPENDENT VARIABLE IS LINEAR.
THE EXPECTED VALUE OF THE ERROR TERM IS ZERO.
THE VARIANCE OF THE ERROR TERM IS THE SAME FOR ALL OBSERVATIONS: $E(e_i^2) = \sigma_e^2$.
THE ERROR TERM IS UNCORRELATED ACROSS OBSERVATIONS: $E(e_i e_j) = 0$ FOR i ≠ j.
THE ERROR TERM IS NORMALLY DISTRIBUTED.


SIMPLE LINEAR REGRESSION: STANDARD ERROR OF ESTIMATE: $\sqrt{\sum e_i^2/(n-2)}$. TWO PARAMETERS HAVE BEEN ESTIMATED (ALPHA AND BETA), SO n-2 REPRESENTS THE DIFFERENCE BETWEEN THE NUMBER OF OBSERVATIONS AND THE NUMBER OF PARAMETERS ESTIMATED FROM THESE OBSERVATIONS.

THE COEFFICIENT OF DETERMINATION (R²) TELLS WHAT PERCENTAGE OF THE VARIATION IN THE DEPENDENT VARIABLE IS EXPLAINED BY THE INDEPENDENT VARIABLE.

IT IS EQUAL TO [TOTAL VARIATION - UNEXPLAINED VARIATION] / TOTAL VARIATION.

HYPOTHESIS TESTING: t STATISTIC = [ESTIMATED BETA - HYPOTHESISED VALUE OF BETA] / $s_{\hat\beta}$, WHERE $s_{\hat\beta}$ IS THE STANDARD ERROR OF THE ESTIMATED BETA. FOR EXAMPLE, IF THE ESTIMATED BETA = 1.5, THE HYPOTHESISED BETA = 1 AND $s_{\hat\beta}$ = 0.20, THEN THE t STATISTIC = 0.5/0.2 = 2.5. IF n = 62, SO THAT n-2 = 60, THEN THE CRITICAL VALUE $t_c$ AT THE 95% CONFIDENCE LEVEL (5% SIGNIFICANCE LEVEL) IS 2.00.

SINCE THE OBSERVED STATISTIC OF 2.5 LIES OUTSIDE THE INTERVAL FROM -2.00 TO 2.00, WE REJECT THE NULL HYPOTHESIS THAT BETA = 1. THE 95% CONFIDENCE INTERVAL FOR BETA IS 1.5 ± 2.00 × 0.20, I.E. 1.10 TO 1.90, WHICH DOES NOT CONTAIN 1.
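THE BETA TEST ABOVE REDUCES TO A FEW ARITHMETIC STEPS. IN THE SKETCH BELOW THE CRITICAL VALUES 2.00 (5% LEVEL) AND 2.66 (1% LEVEL) FOR 60 DEGREES OF FREEDOM ARE TAKEN AS GIVEN FROM THE SLIDES RATHER THAN COMPUTED, AND THE FUNCTION NAME IS HYPOTHETICAL.

```python
def slope_t_test(beta_hat, beta_null, std_error, t_critical):
    """Return (t statistic, confidence interval, reject?) for H0: beta = beta_null."""
    t_stat = (beta_hat - beta_null) / std_error
    interval = (beta_hat - t_critical * std_error,
                beta_hat + t_critical * std_error)
    return t_stat, interval, abs(t_stat) > t_critical

# 5% significance level, critical value 2.00 as quoted for 60 degrees of freedom.
t_stat, interval, reject_5pct = slope_t_test(1.5, 1.0, 0.20, t_critical=2.00)
print(round(t_stat, 2), tuple(round(v, 2) for v in interval), reject_5pct)

# 1% significance level, critical value 2.66 as quoted in the slides.
_, _, reject_1pct = slope_t_test(1.5, 1.0, 0.20, t_critical=2.66)
print(reject_1pct)
```

THE SAME ESTIMATE IS REJECTED AT THE 5% LEVEL BUT NOT AT THE 1% LEVEL, BECAUSE THE WIDER INTERVAL THEN INCLUDES THE HYPOTHESISED VALUE OF 1.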


SIMPLE LINEAR REGRESSION: HYPOTHESIS TESTING: WE CAN ALSO SAY THAT WE ARE MORE THAN 95% CONFIDENT THAT THE STOCK BETA IS DIFFERENT FROM 1.

IF WE USE A SIGNIFICANCE LEVEL OF 1%, THE CRITICAL t VALUE IS 2.66, AND SINCE 2.5 < 2.66 WE WILL NOT REJECT THE NULL HYPOTHESIS. AT A HIGHER LEVEL OF CONFIDENCE (LOWER LEVEL OF SIGNIFICANCE) THE CONFIDENCE INTERVAL WIDENS, AND NOW WE WILL NOT REJECT THE NULL HYPOTHESIS. THIS REDUCES THE PROBABILITY OF A TYPE I ERROR (REJECTING THE NULL WHEN IT IS TRUE), BUT IT INCREASES THE PROBABILITY OF A TYPE II ERROR, THAT IS, FAILING TO REJECT THE NULL WHEN IT IS FALSE.

OFTEN FINANCIAL ANALYSTS INDICATE THE p VALUE. THE p VALUE IS THE SMALLEST LEVEL OF SIGNIFICANCE AT WHICH THE NULL HYPOTHESIS CAN BE REJECTED.

IN FACT, IN MOST SOFTWARE THE p VALUE CORRESPONDS TO A TEST OF THE NULL HYPOTHESIS THAT THE TRUE VALUE IS EQUAL TO 0. FOR EXAMPLE, IF THE p VALUE IS 0.005, WE CAN REJECT THE HYPOTHESIS THAT THE TRUE PARAMETER IS EQUAL TO 0 AT THE 0.5% SIGNIFICANCE LEVEL (99.5% CONFIDENCE LEVEL).

STRONGER REGRESSION RESULTS LEAD TO SMALLER STANDARD ERRORS OF AN ESTIMATED PARAMETER AND RESULT IN TIGHTER CONFIDENCE INTERVALS, SO WE WILL REJECT THE NULL HYPOTHESIS EVEN AT HIGHER CONFIDENCE (LOWER LEVEL OF SIGNIFICANCE).

THE STANDARD ERROR OF BETA = $\sqrt{\frac{SSE/(n-2)}{SSX}}$, AND THE t STATISTIC FOR A ZERO SLOPE CAN ALSO BE WRITTEN AS $t = \frac{\rho\sqrt{n-2}}{\sqrt{1-\rho^2}}$, WHERE ρ IS THE SAMPLE CORRELATION. IN THE STANDARD ERROR OF BETA, THE NUMERATOR IS THE SUM OF SQUARED ERRORS PER DEGREE OF FREEDOM, $\sum e_i^2/(n-2)$, AND THE DENOMINATOR SSX IS THE SUM OF SQUARES OF THE INDEPENDENT VARIABLE ABOUT ITS MEAN.


SIMPLE LINEAR REGRESSION: HYPOTHESIS TESTING: THE F-TEST THAT THE SLOPE COEFFICIENT EQUALS ZERO IS BASED ON THE F-STATISTIC. THE F-STATISTIC IS THE RATIO RSS/[SSE/(n-2)], WHERE RSS IS THE REGRESSION (EXPLAINED) SUM OF SQUARES AND SSE IS THE SUM OF SQUARED ERRORS. IT IS THE RATIO OF THE VARIATION OF Y THAT IS EXPLAINED BY THE REGRESSION TO THE VARIANCE OF THE ERROR.

THE F-STATISTIC HAS 1 (ONE SLOPE COEFFICIENT) AND n-2 DEGREES OF FREEDOM.

IF THE REGRESSION MODEL DOES A GOOD JOB THEN THE RATIO SHOULD BE HIGH: THE EXPLAINED REGRESSION SUM OF SQUARES PER ESTIMATED PARAMETER WILL BE HIGH RELATIVE TO THE UNEXPLAINED VARIATION PER DEGREE OF FREEDOM.

WHEN THERE IS ONE INDEPENDENT VARIABLE, THE F-STATISTIC IS THE SQUARE OF THE t-STATISTIC FOR THE SLOPE COEFFICIENT.


SIMPLE LINEAR REGRESSION: HYPOTHESIS TESTING: ONE OF THE TESTS CONDUCTED TO EVALUATE THE PERFORMANCE OF A MUTUAL FUND MANAGER IS ON EXCESS RETURNS, WHICH IS ALPHA. HERE THE NULL HYPOTHESIS IS THAT ALPHA = 0 AND THE ALTERNATIVE HYPOTHESIS IS THAT ALPHA IS NOT EQUAL TO ZERO. LET'S SEE THE RESULTS OF A REGRESSION:

MULTIPLE R = 0.9280; R SQUARE = 0.8611; STANDARD ERROR OF ESTIMATE = 0.0174; OBSERVATIONS = 60.

ANOVA        DEGREES OF FREEDOM   SUM OF SQUARES   MEAN SS   F
REGRESSION   1                    0.1093           0.1093    359.64
RESIDUAL     58                   0.0176           0.0003
TOTAL        59                   0.1269

         COEFFICIENT   STD ERROR   t-STATISTIC
ALPHA    0.0009        0.0023      0.4036
BETA     0.7902        0.0417      18.9655

THE VALUE OF THE ALPHA COEFFICIENT IS ONLY ABOUT 0.4 TIMES THE STANDARD ERROR FOR THAT COEFFICIENT, AND ITS t-STATISTIC IS 0.4036. THEREFORE WE CANNOT REJECT THE NULL HYPOTHESIS THAT ALPHA = 0.

THE p VALUE FOR THE BETA t-STATISTIC IS 0.0001. THEREFORE, IF THE TRUE VALUE OF THIS COEFFICIENT WERE ZERO, A t-STATISTIC THIS LARGE WOULD BE EXTREMELY UNLIKELY. SIMILARLY, THE p VALUE FOR THE F-STATISTIC IS LESS THAN 0.0001.
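THE FIGURES IN THIS REGRESSION OUTPUT ARE LINKED BY THE FORMULAS GIVEN EARLIER, AND THE LINKS CAN BE CHECKED FROM THE ROUNDED VALUES SHOWN. THE SKETCH BELOW USES ONLY NUMBERS FROM THE OUTPUT; THE VARIABLE NAMES ARE HYPOTHETICAL, AND SMALL DISCREPANCIES ARE EXPECTED BECAUSE THE INPUTS ARE ROUNDED TO FOUR DECIMALS.

```python
from math import sqrt

# Values copied from the regression output (rounded as shown there).
ss_regression, ss_residual, ss_total = 0.1093, 0.0176, 0.1269
n_obs = 60
beta_hat, beta_se = 0.7902, 0.0417

r_squared = ss_regression / ss_total                    # explained / total variation
see = sqrt(ss_residual / (n_obs - 2))                   # standard error of estimate
f_stat = ss_regression / (ss_residual / (n_obs - 2))    # RSS / [SSE / (n - 2)]
t_beta = beta_hat / beta_se                             # t statistic for H0: beta = 0

print(round(r_squared, 4), round(see, 4))
print(round(f_stat, 1), round(t_beta ** 2, 1))          # F should be close to t squared
```

THE RECOMPUTED R SQUARE AND STANDARD ERROR OF ESTIMATE SHOULD AGREE WITH THE REPORTED 0.8611 AND 0.0174 UP TO ROUNDING, AND THE F-STATISTIC SHOULD BE CLOSE TO THE SQUARE OF THE BETA t-STATISTIC.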