
CAS Individual Claim Simulator Validation Report

ReservePrism

April 2018


Table of Contents

1. Background
2. Validation Method
3. Fitting
3.1 Distribution Fitting
3.2 Copula Fitting
3.3 Exposure Index
3.4 Report Lag Impact on Frequency
4. Simulation
4.1 Open Claim Loss Development
4.2 Claim Reopenness
4.3 Distribution
4.4 Copula
4.5 Exposure Index
4.6 LAE
4.7 Deductible and Limit
Appendix. Simulation Module Test R Code


[Diagram for the Fitting Module Validation Process figure: Test Claim Data → CAS Simulator Fitting Module → Distribution (Report Lag, Settlement Lag, Frequency, Severity) and Dependency (Report Lag, Settlement Lag, and Severity Copula; Frequency Copula) → Check against assumptions used in the test data]

1. Background

This document explains the validation work carried out by the development team for the CAS Individual Claim Simulator. The development team tested the general reasonableness of the fitting and simulation results as well as individual modeling choices. Test data is generated by ReservePrism, a validated commercial software package for loss fitting and simulation. Blind tests were used: the tester did not know the assumptions used in the data generation and had to try different distributions and copulas.

While the developers performed many tests and believe in the correctness of the simulator, small errors may still exist. The developers and the sponsor make no guarantee that the simulator is error-free or that it will meet users' specific purposes.

2. Validation Method

The validation tests focus on two areas: the fitting function and the simulation function. For the fitting function, the goal is to make sure that the best-fitting distribution and copula are consistent with the assumptions used in test data generation. Test data is fed into the fitting module of the simulator. Fitted results are then compared to the assumptions used in data generation.

Figure 1. Fitting Module Validation Process

For the simulation function, the goal is to make sure that the distribution/copula of simulated individual claims follows the simulation assumptions. Simulated data from the CAS Individual Claim Simulator are analyzed using R programs to derive the distribution and copula. Maximum likelihood estimation (MLE) is used and the results are compared to the simulation assumptions.

[Diagram: Simulation Assumptions (Report Lag; Settlement Lag; Frequency; Severity; Report Lag, Settlement Lag, and Severity Copula; Frequency Copula; Reopen Probability; Loss Development; LAE; Exposure Index; Severity Index; Deductible; Limit) → CAS Simulator Simulation Module → Simulated Claim Data → R Program → Check against the simulation assumptions]

Figure 2. Simulation Module Validation Process

3. Fitting

The fitting module tries to find the best distribution or copula fit based on claim data. In this test, the test data is simulated from ReservePrism, a commercial claim simulation software package. The assumptions used in test data generation were kept secret from the tester. Claim data is simulated for two business lines: Home and Auto. The following assumptions are used in test data generation.

Table 1. Test Data Assumption

| Business Line | Home | Auto |
|---|---|---|
| Claim Type | Dwelling | PD |
| Annual Frequency | Poisson (λ = 1200) | Poisson (λ = 2000) |
| Exposure Index | Level | 8% Annual Increase |
| Report Lag | Weibull (shape=9.5, scale=800) | Exponential (rate=0.0109589) |
| Settlement Lag | Weibull (shape=6.5, scale=180) | Exponential (rate=0.002739726) |
| Severity | Lognormal (μ=12.1664, σ=0.5326) | Lognormal (μ=9, σ=0.6) |
| Severity Index | Level | Level |
| Correlation between Severity and Settlement Lag | Frank (alpha=50) | Independent |
| Deductible | 20000 | 1000 |
| Limit (Prob.) | 200000 (37.5%); 300000 (62.5%) | 8000 (20%); 15000 (30%); 20000 (50%) |

Frequency Correlation between the two lines: 85%

The test data is then fed into the simulator for fitting using maximum likelihood estimation (MLE). The results are described below.

3.1 Distribution Fitting

Report lag, settlement lag, frequency and severity are covered in distribution fitting. The fitting module tries a list of distributions and selects the best fit based on criteria such as the Akaike information criterion (AIC) and the Bayesian information criterion (BIC).
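As a point of reference, the two criteria can be sketched from their textbook definitions. This is a minimal illustration, not the simulator's own code; the function names are hypothetical, and it is an assumption that the fitting module uses exactly these forms.

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion: k ln(n) - 2 ln(L)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


if __name__ == "__main__":
    # Illustrative inputs: a two-parameter fit with a (rounded) log likelihood
    # of -57,366 on 9,610 observations.
    print(aic(-57366, 2), bic(-57366, 2, 9610))
```

A lower value indicates a better trade-off between fit and number of parameters; BIC penalizes extra parameters more heavily than AIC once the sample exceeds a handful of observations.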

Report Lag

Table 2 lists the validation result for report lag. The fitting module is able to find the correct distribution and estimated distribution parameters are close to the assumptions in Table 1.

Table 2. Report Lag Fitting Result

| LoB | Distribution | Parameter | Standard Deviation (1) | DoF (2) | KS Test (3) | p value | Log likelihood | AIC | BIC |
|---|---|---|---|---|---|---|---|---|---|
| Home | Normal | mean:128.845; sd:685.675 | NA | 9608 | 0.7 | 0 | -75,717 | 151,439 | 151,453 |
| Home | Lognormal | meanlog:6.62; sdlog:0.137 | 0.0014; 0.001 | 9608 | 0.07 | 0 | -58,192 | 116,387 | 116,402 |
| Home | Pareto | Fitting Unsuccessful | | | | | | | |
| Home | Weibull | shape:9.276; scale:798.361 | 0.0737; 0.9247 | 9608 | 0.01 | 0.6 | -57,366 | 114,735 | 114,750 |
| Home | Gamma | shape:53.66; scale:14.096 | 0.7167; 0.189 | 9608 | 1 | 0 | -57,980 | 115,963 | 115,977 |
| Home | Uniform | Fitting Unsuccessful | | | | | | | |
| Home | Exponential | Fitting Unsuccessful | | | | | | | |
| Auto | Normal | mean:87.639; sd:86.405 | 0.5038; 0.3531 | 29410 | 0.16 | 0 | -173,057 | 346,117 | 346,134 |
| Auto | Lognormal | meanlog:3.89; sdlog:1.277 | 0.0074; 0.0053 | 29410 | 0.08 | 0 | -163,297 | 326,598 | 326,615 |
| Auto | Pareto | Fitting Unsuccessful | | | | | | | |
| Auto | Weibull | Fitting Unsuccessful | | | | | | | |
| Auto | Gamma | shape:0.993; scale:88.19 | 0.0072; 0.8238 | 29410 | 1 | 0 | -160,921 | 321,846 | 321,863 |
| Auto | Uniform | Fitting Unsuccessful | | | | | | | |
| Auto | Exponential | rate:0.011 | 1e-04 | 29411 | 0.01 | 0 | -160,922 | 321,846 | 321,854 |

Notes:
1. Standard deviation of the parameter estimation.
2. Degree of freedom, which is the number of data points – number of parameters.
3. Kolmogorov–Smirnov test statistic. Its p value is listed in the next column.
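For readers unfamiliar with the KS statistic in the table, the sketch below computes the one-sample statistic (the largest gap between the empirical CDF and a candidate CDF) for a Weibull sample. It is an illustration only: the helper names are hypothetical, the sample is drawn with Python's `random` module rather than taken from the test data, and the shape and scale are simply the fitted Home values from the table.

```python
import math
import random


def weibull_cdf(x: float, shape: float, scale: float) -> float:
    """CDF of a Weibull distribution with the given shape and scale."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-((x / scale) ** shape))


def ks_statistic(sample, cdf) -> float:
    """One-sample Kolmogorov-Smirnov statistic: sup |F_empirical - F|."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def demo(seed: int = 1, n: int = 5000) -> float:
    """Draw a Weibull sample and measure its distance from the same Weibull."""
    rng = random.Random(seed)
    # random.weibullvariate takes (scale, shape) in that order.
    sample = [rng.weibullvariate(798.361, 9.276) for _ in range(n)]
    return ks_statistic(sample, lambda x: weibull_cdf(x, 9.276, 798.361))
```

A small statistic means the empirical and candidate CDFs stay close everywhere, which is why the correctly specified distribution tends to show the smallest KS value.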

Figure 1 and Figure 2 compare the data and the chosen distributions for report lag.

Figure 1. Report Lag Fitting for Home Line

Figure 2. Report Lag Fitting for Auto Line


Settlement Lag

Table 3 lists the validation result for settlement lag. The fitting module is able to find the correct distribution and estimated distribution parameters are close to the assumptions in Table 1.

Table 3. Settlement Lag Fitting Result

| LoB | Distribution | Parameter | Standard Deviation (1) | DoF (2) | KS Test (3) | p value | Log likelihood | AIC | BIC |
|---|---|---|---|---|---|---|---|---|---|
| Home | Normal | mean:167.03; sd:29.992 | 0.3151; 0.221 | 9057 | 0.04 | 0 | -43,714 | 87,432 | 87,446 |
| Home | Lognormal | meanlog:5.098; sdlog:0.198 | 0.0021; 0.0015 | 9057 | 0.08 | 0 | -44,383 | 88,769 | 88,783 |
| Home | Pareto | Fitting Unsuccessful | | | | | | | |
| Home | Weibull | shape:6.479; scale:179.004 | 0.0532; 0.3056 | 9057 | 0.01 | 0.11 | -43,554 | 87,112 | 87,126 |
| Home | Gamma | shape:27.457; scale:6.071 | 0.3994; 0.0891 | 9057 | 1 | 0 | -44,091 | 88,187 | 88,201 |
| Home | Uniform | Fitting Unsuccessful | | | | | | | |
| Home | Exponential | rate:0.006 | 1e-04 | 9058 | 0.45 | 0 | -55,410 | 110,821 | 110,828 |
| Auto | Normal | mean:77.587; sd:411.109 | 5.5655; 4.3357 | 25414 | 0.43 | 0 | -187,547 | 375,098 | 375,114 |
| Auto | Lognormal | meanlog:5.166; sdlog:1.282 | 0.008; 0.0057 | 25414 | 0.08 | 0 | -173,671 | 347,346 | 347,362 |
| Auto | Pareto | Fitting Unsuccessful | | | | | | | |
| Auto | Weibull | shape:1.005; scale:311.572 | 0.0049; 2.0458 | 25414 | 0.01 | 0.23 | -171,332 | 342,669 | 342,685 |
| Auto | Gamma | shape:1.008; scale:309.914 | 0.0079; 3.1273 | 25414 | 1 | 0 | -171,333 | 342,670 | 342,687 |
| Auto | Uniform | Fitting Unsuccessful | | | | | | | |
| Auto | Exponential | rate:0.003 | 0 | 25415 | 0.01 | 0.03 | -171,333 | 342,669 | 342,677 |

Notes:
1. Standard deviation of the parameter estimation.
2. Degree of freedom, which is the number of data points – number of parameters.
3. Kolmogorov–Smirnov test statistic. Its p value is listed in the next column.

Figure 3 and Figure 4 compare the data and the chosen distributions for settlement lag.

Figure 3. Settlement Lag Fitting for Home Line


Figure 4. Settlement Lag Fitting for Auto Line

Monthly Frequency


Table 4 lists the validation result for monthly frequency. The assumed distribution is Poisson. For the Home line, the fitting module finds the negative binomial distribution to be a better fit than the Poisson distribution based on AIC and BIC; for the Auto line, the Poisson distribution has the lower AIC and BIC. The frequency fit is less precise than the fit for other variables such as report lag because the number of observations is much smaller: 10 years of experience data yield only 120 monthly data points, compared with thousands of observations for the other variables. However, the estimated Poisson distribution parameters are close to the assumptions in Table 1.

Table 4. Monthly Frequency Fitting Result

| LoB | Distribution | Parameter | Standard Deviation (1) | DoF (2) | Chi-Sq Test (3) | p value | Log likelihood | AIC | BIC |
|---|---|---|---|---|---|---|---|---|---|
| Home | Poisson | lambda:101.067 (4) | 0.9858 | 103 | 18,377 | 0 | -439 | 881 | 884 |
| Home | Negative Binomial | size:101.61; prob:0.501 | 27.6725; 0.0682 | 102 | 122 | 0 | -424 | 851 | 857 |
| Home | Geometric | prob:0.01 | 9e-04 | 103 | 218 | 0 | -585 | 1,171 | 1,174 |
| Auto | Poisson | lambda:169.017 (5) | 1.1868 | 119 | 12 | 0.38 | -457 | 916 | 919 |
| Auto | Negative Binomial | size:2102.025; prob:0.926 | 260.2602; 0.0085 | 118 | 14 | 0.24 | -459 | 922 | 927 |
| Auto | Geometric | prob:0.006 | 5e-04 | 119 | 148 | 0 | -736 | 1,474 | 1,477 |

Notes:
1. Standard deviation of the parameter estimation.
2. Degree of freedom, which is the number of data points – number of parameters.
3. Chi-Square test statistic. Its p value is listed in the next column.
4. The annual frequency assumption is Poisson with λ equal to 1200, which is close to the monthly frequency estimate 101.067 × 12 = 1,213.
5. The annual frequency assumption is Poisson with λ equal to 2000, which is close to the monthly frequency estimate 169.017 × 12 = 2,028.
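The Poisson rows above can be reproduced with a short calculation. The sketch below uses the textbook results that the Poisson MLE is the sample mean and its standard error is sqrt(λ/n); taking n as the reported degrees of freedom plus one parameter (104 for Home, 120 for Auto) is an inference from note 2, not something the report states. Function names are hypothetical.

```python
import math


def poisson_mle(counts):
    """Return the Poisson MLE (the sample mean) and its standard error."""
    n = len(counts)
    lam = sum(counts) / n
    return lam, math.sqrt(lam / n)


def poisson_se(lam: float, n_obs: int) -> float:
    """Standard error of the Poisson MLE for a given estimate and sample size."""
    return math.sqrt(lam / n_obs)


def annualize(monthly_lambda: float) -> float:
    """Convert a monthly expected claim count to an annual one."""
    return 12 * monthly_lambda


if __name__ == "__main__":
    for lob, lam, n in (("Home", 101.067, 104), ("Auto", 169.017, 120)):
        print(lob, round(poisson_se(lam, n), 4), round(annualize(lam)))
```

Under that reading of the sample sizes, the computed standard errors agree with the Standard Deviation column to four decimals.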

The comparison graphs (Figure 5 and Figure 6) show little visible difference between the Poisson fit and the Negative Binomial fit.


Figure 5. Monthly Frequency Fitting for Home Line

Poisson Distribution Negative Binomial Distribution

Figure 6. Monthly Frequency Fitting for Auto Line

Poisson Distribution Negative Binomial Distribution

Severity

Table 5 lists the validation result for severity. Because of the deductible and limit, the loss data is truncated. The underlying severity distribution before deductible and limit is derived using MLE. The AIC and BIC values of the successfully fitted distributions are too close to support a decisive conclusion about the best distribution. However, the estimated distribution parameters (Lognormal distribution) are close to the assumptions in Table 1.

Table 5. Severity Fitting Result

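| LoB | Distribution | Parameter | Converge (1) | DoF (2) | Log likelihood | AIC | BIC |
|---|---|---|---|---|---|---|---|
| Home | Normal | mean:182872.53313; sd:69683.70117 | successful convergence | 9057 | -113,510 | 227,025 | 245,139 |
| Home | Lognormal | meanlog:12.05007; sdlog:0.41117 | successful convergence | 9057 | -113,962 | 227,927 | 246,041 |
| Home | Pareto | Fitting Unsuccessful | | | | | |
| Home | Weibull | Fitting Unsuccessful | | | | | |
| Home | Gamma | Fitting Unsuccessful | | | | | |
| Home | Uniform | Fitting Unsuccessful | | | | | |
| Home | Exponential | rate:1e-05 | successful convergence | 9058 | -117,875 | 235,753 | 244,810 |
| Auto | Normal | mean:7582.14576; sd:5121.41155 | successful convergence | 25414 | -247,315 | 494,633 | 545,461 |
| Auto | Lognormal | meanlog:8.92714; sdlog:0.52153 | successful convergence | 25414 | -246,412 | 492,828 | 543,656 |
| Auto | Pareto | Fitting Unsuccessful | | | | | |
| Auto | Weibull | Fitting Unsuccessful | | | | | |
| Auto | Gamma | Fitting Unsuccessful | | | | | |
| Auto | Uniform | Fitting Unsuccessful | | | | | |
| Auto | Exponential | rate:0.00013 | successful convergence | 25415 | -252,438 | 504,877 | 530,291 |

Notes:
1. Convergence status of the fitting.
2. Degree of freedom, which is the number of data points – number of parameters.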
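To illustrate how an underlying severity distribution can be recovered when losses are only observed after a deductible and limit, the sketch below writes out a lognormal log likelihood with left truncation at the deductible and right censoring at deductible plus limit. It is a generic textbook construction under stated assumptions (a single deductible and limit for all claims, only claims above the deductible observed); it is not taken from the simulator, and all function names are hypothetical.

```python
import math
import random


def lognorm_cdf(x: float, mu: float, sigma: float) -> float:
    """Lognormal CDF via the error function."""
    return 0.5 * (1.0 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2.0))))


def lognorm_logpdf(x: float, mu: float, sigma: float) -> float:
    """Log density of the lognormal distribution."""
    z = (math.log(x) - mu) / sigma
    return -0.5 * z * z - math.log(x * sigma * math.sqrt(2.0 * math.pi))


def loglik(payments, mu, sigma, deductible, limit) -> float:
    """Log likelihood of paid amounts (after deductible and limit)."""
    log_surv_d = math.log(1.0 - lognorm_cdf(deductible, mu, sigma))
    total = 0.0
    for p in payments:
        if p >= limit:
            # Censored: we only know the ground-up loss exceeded deductible + limit.
            total += math.log(1.0 - lognorm_cdf(deductible + limit, mu, sigma))
        else:
            # Fully observed: ground-up loss is payment + deductible.
            total += lognorm_logpdf(p + deductible, mu, sigma)
        # Truncation: condition on the loss exceeding the deductible.
        total -= log_surv_d
    return total


def simulate_payments(n, mu, sigma, deductible, limit, seed=0):
    """Simulate paid amounts for claims whose ground-up loss exceeds the deductible."""
    rng = random.Random(seed)
    out = []
    while len(out) < n:
        x = rng.lognormvariate(mu, sigma)
        if x > deductible:
            out.append(min(x, deductible + limit) - deductible)
    return out
```

Maximizing `loglik` over `mu` and `sigma` (for example with a numerical optimizer) would give estimates of the ground-up severity parameters; with claim-specific deductibles and limits, the same terms would be evaluated per claim.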

Figure 7 and Figure 8 compare the loss data and the truncated Lognormal distributions for severity. The deductible and limit may not be uniform for all claims. An average deductible and limit is used to show the truncated distribution. Therefore, discrepancy may be spotted at both ends because of the averaging. However, it is only for the ease of presentation.


Figure 7. Severity Fitting for Home Line

Figure 8. Severity Fitting for Auto Line


3.2 Copula Fitting

The fitting module can estimate the relationship among severity, settlement lag and report lag within each business line/claim type. It can also estimate the relationship of monthly frequency among business lines.

Copula among severity, report lag and settlement lag

Table 6 shows the fitting results for copula among severity and lags.

Table 6. Severity, Report Lag and Settlement Lag Copula Fitting Result

| LoB | Copula | Parameter (1) | Standard Deviation (2) | DoF (3) | Sn (4) | p value |
|---|---|---|---|---|---|---|
| Home | normal | 0.9034; -0.0154; -0.0147 | 0.0025; 0.0106; 0.0101 | | 3.4615 | 0.8284 |
| Home | clayton | 0.4507 | 0.0065 | | 23.4856 | 0.0025 |
| Home | gumbel | 1.2397 | 0.0048 | | 27.1433 | 0.5597 |
| Home | frank | 2.0104 | 0.0318 | | 22.1095 | 0.0025 |
| Home | joe | 1.2767 | 0.007 | | 36.2521 | 0.4851 |
| Home | t | 0.9219; -0.0162; -0.0162 | 0.0019; 0.0119; 0.0123 | 7.729163 | NA | NA |
| Auto | normal | 0.003; 9e-04; -0.0162 | 0.0059; 0.0059; 0.0059 | | NA | NA |
| Auto | clayton | 0 | NA | | NA | NA |
| Auto | gumbel | 1 | 0.0015 | | NA | NA |
| Auto | frank | 0 | NA | | NA | NA |
| Auto | joe | 1 | 0.002 | | NA | NA |
| Auto | t | 0.003; 9e-04; -0.0163 | 0.0059; 0.0059; 0.0059 | 363.7735 | NA | NA |

Notes:
1. Parameter: copula parameter. For normal and t copula, the first parameter is the correlation between severity and settlement lag, the second one is between severity and report lag, and the third one is between settlement lag and report lag.
2. Standard Deviation of the parameter estimation.
3. Degree of freedom, which is the number of data points – number of parameters.
4. Sn: Cramer-von Mises Statistic. p-value of Sn test is listed in the next column.

In test data generation, the Home line assumes a Frank copula between severity and settlement lag with a parameter of 50. Report lag is assumed to be independent of severity and settlement lag. The test data therefore uses three pairwise copulas: one Frank copula and two independence copulas. The simulator instead fits the relationship among severity, report lag and settlement lag jointly with a single copula, for consistency and comprehensiveness, and the normal copula is found to describe the relationship best. Report lag is found to have near-zero correlation with severity and settlement lag, consistent with the independence assumption. Loss size after deductible and limit and settlement lag have an estimated correlation of 90.3%. The p value of the statistical test is 0.83, so the test does not reject the hypothesis of a 90.3% correlation. Considering the impact of deductible and limit, this is close to the correlation (around 99%) implied by a Frank copula with a parameter of 50.
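The "around 99%" figure can be checked against the closed-form rank correlations of the Frank copula, which are expressed through Debye functions. The sketch below evaluates them numerically; it assumes the quoted correlation is a rank correlation such as Spearman's rho (the report does not say which measure is meant), and the function names are hypothetical.

```python
import math


def debye(k: int, x: float, steps: int = 20000) -> float:
    """Debye function D_k(x) = (k / x**k) * integral_0^x t**k / (e**t - 1) dt.

    Evaluated with the midpoint rule, which avoids the removable
    singularity of the integrand at t = 0.
    """
    h = x / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += t ** k / math.expm1(t)
    return k * total * h / x ** k


def frank_kendall_tau(theta: float) -> float:
    """Kendall's tau of a Frank copula with parameter theta."""
    return 1.0 - 4.0 / theta * (1.0 - debye(1, theta))


def frank_spearman_rho(theta: float) -> float:
    """Spearman's rho of a Frank copula with parameter theta."""
    return 1.0 - 12.0 / theta * (debye(1, theta) - debye(2, theta))


if __name__ == "__main__":
    print(frank_kendall_tau(50.0), frank_spearman_rho(50.0))
```

For a parameter of 50, Spearman's rho comes out at roughly 0.99 and Kendall's tau at roughly 0.92, so the report's figure is consistent with Spearman's rho under this assumption.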

The Auto line assumes severity, report lag and settlement lag are independent from each other. The fitting result shows near-zero correlation as well.

Figure 9 and Figure 10 compare the test data with the chosen copula for each business line.

Figure 9. Copula Fitting for Home Line


Notes: Margin 1: Severity with Lognormal distribution Margin 2: Settlement Lag with Weibull distribution Margin 3: Report Lag with Weibull distribution

Figure 10. Copula Fitting for Auto Line

Notes: Margin 1: Severity with Lognormal distribution Margin 2: Settlement Lag with Exponential distribution Margin 3: Report Lag with Exponential distribution

Frequency Copula


Table 7 shows the copula fitting result for the monthly frequency between the two business lines. The assumption is that the annual frequencies follow a normal (Gaussian) copula with a correlation coefficient of 85%. The fitting result shows that the correlation coefficient lies in the range of [50.5%, 68.2%], that is, the estimate of 59.36% plus or minus two standard deviations of 4.44%. The discrepancy is likely caused by an insufficient number of data points: only 10 pairs of annual frequencies are generated from the assumed normal copula. A longer history is likely to reduce the discrepancy.

Table 7. Frequency Copula Fitting Result

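| Copula | Parameter (1) | Standard Deviation (2) | DoF (3) | Sn (4) | p value |
|---|---|---|---|---|---|
| normal | 0.5936 | 0.0444 | | 0.1159 | 0.0025 |
| clayton | 0.7311 | 0.1827 | | 0.3237 | 0.0025 |
| gumbel | 1.7872 | 0.1134 | | 0.0664 | 0.0423 |
| frank | 4.6165 | 0.6222 | | 0.093 | 0.0025 |
| joe | 2.2788 | 0.1507 | | 0.0594 | 0.0672 |
| t | 0.6131 | 0.0626 | 4.74955242 | NA | NA |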

3.3 Exposure Index

The impact of the exposure index needs to be considered in frequency distribution fitting. Frequency data is normalized by removing the impact of exposure change over time.

The test data assumes an 8% business volume increase for 10 years for Auto line, with 2000 expected claims in the first year. The fitting results indicate an expected number of 2028 claims per year after removing the impact of business volume change. This indicates exposure index has been reflected in the fitting module approximately. Exposure index can incorporate not only business volume but also other cyclical patterns such as underwriting cycle and seasonality as well.
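The normalization step can be pictured as dividing each month's claim count by the exposure index for that month. The sketch below is hypothetical: it assumes the index steps up once per year by the annual growth rate and starts at 1.0 in the first year, which may differ from how the simulator actually defines and applies its index.

```python
def exposure_index(n_months: int, annual_growth: float):
    """Monthly exposure index that steps up by annual_growth each year (assumed form)."""
    return [(1.0 + annual_growth) ** (m // 12) for m in range(n_months)]


def normalize_counts(counts, index):
    """Remove the exposure effect by dividing each count by its index value."""
    return [c / e for c, e in zip(counts, index)]


if __name__ == "__main__":
    idx = exposure_index(36, 0.08)
    # Hypothetical counts that grow exactly with exposure from a base of 170.
    raw = [170.0 * e for e in idx]
    print(normalize_counts(raw, idx)[:3], normalize_counts(raw, idx)[-3:])
```

After normalization, counts that grew only because of exposure are brought back to the base-year level, so a single frequency distribution can be fitted across all years.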

3.4 Report Lag Impact on Frequency

Some business lines may have very long report lags, in which case the experience data is heavily truncated because many IBNR claims are not yet observable for recent accident years. The test data is generated to have considerably long report lags. Based on the report lag distributions in Table 1, the Home line (Weibull) has an expected report lag of 759 days and the Auto line (Exponential) has an expected report lag of 91 days. The fitting module considers the possibility of IBNR claims when adjusting the frequency data. The result of frequency distribution fitting in Section 3.1 shows that the impact of report lag has been appropriately taken into account.
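The expected report lags follow directly from the Table 1 distributions, and the same distributions give the share of claims expected to be reported by a given age. The sketch below shows both calculations; using the lag CDF as a "reported share" to gross up recent counts is an illustrative assumption about how an IBNR adjustment could work, not a description of the simulator's method. Function names are hypothetical.

```python
import math


def weibull_mean(shape: float, scale: float) -> float:
    """Mean of a Weibull distribution: scale * Gamma(1 + 1/shape)."""
    return scale * math.gamma(1.0 + 1.0 / shape)


def exponential_mean(rate: float) -> float:
    """Mean of an exponential distribution: 1 / rate."""
    return 1.0 / rate


def weibull_cdf(t: float, shape: float, scale: float) -> float:
    """Share of claims reported within t days under a Weibull report lag."""
    return 1.0 - math.exp(-((t / scale) ** shape)) if t > 0 else 0.0


def exponential_cdf(t: float, rate: float) -> float:
    """Share of claims reported within t days under an exponential report lag."""
    return 1.0 - math.exp(-rate * t) if t > 0 else 0.0


if __name__ == "__main__":
    print(round(weibull_mean(9.5, 800)), round(exponential_mean(0.0109589)))
    # Share of claims reported one year after occurrence, for each line.
    print(weibull_cdf(365, 9.5, 800), exponential_cdf(365, 0.0109589))
```

With a Weibull shape as high as 9.5, almost no Home claims are reported within the first year, which is why ignoring the report lag would badly understate recent Home frequencies.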

4. Simulation

To make sure the simulation module is working properly, the data simulated by the CAS Individual Claim Simulator is tested against the simulation assumptions. Continuing from the fitting test, two lines (Home and Auto) are simulated 100 times for open claim loss development, claim reopening, IBNR and future claims. Table 8 lists the simulation assumptions.


Table 8. Simulation Assumptions

| Section | Item | Home | Auto |
|---|---|---|---|
| | Claim Type | Dwelling | PD |
| Open Claim | Open Claim Loss Development | Factor table by development year (listed after this table) | Regression formula (listed after this table) |
| Claim Reopen | Reopen Probability | By development year (listed after this table) | Same as Home |
| Claim Reopen | Reopen Claim Loss Development | Factor table by development year (listed after this table) | Same as Home |
| Claim Reopen | Reopen Lag | Exponential (rate=0.005) | Exponential (rate=0.005) |
| Claim Reopen | Resettlement Lag | Exponential (rate=0.01) | Exponential (rate=0.005) |
| IBNR/Future Claim | Monthly Frequency | Poisson (λ = 101.067) | Negative Binomial (size=2102.025, prob=0.926) |
| IBNR/Future Claim | Exposure Index | Level | 8% Annual Increase |
| IBNR/Future Claim | Report Lag | Weibull (shape=9.276, scale=798.361) | Exponential (rate=0.011) |
| IBNR/Future Claim | Settlement Lag | Weibull (shape=6.479, scale=179.004) | Weibull (shape=1.005, scale=311.572) |
| IBNR/Future Claim | Severity | Lognormal (μ=12.05007, σ=0.41117) | Lognormal (μ=8.92714, σ=0.52153) |
| IBNR/Future Claim | Severity Index | Level | Level till 2017 and 3% annual increase thereafter |
| IBNR/Future Claim | Correlation among Severity, Settlement Lag, and Report Lag | Normal copula (matrix listed after this table) | Normal copula (matrix listed after this table) |
| IBNR/Future Claim | Deductible | 20000 | 1000 |
| IBNR/Future Claim | Limit (Prob.) | 200000 (37.5%); 300000 (62.5%) | 8000 (20%); 15000 (30%); 20000 (50%) |
| IBNR/Future Claim | LAE | Formula (listed after this table) | Formula (listed after this table) |
| | Frequency Correlation | Normal copula (0.6131) | Normal copula (0.6131) |

Open claim loss development, Home:

| Dev. Year | Mean Dev. Factor | Volatility |
|---|---|---|
| 0 | 1.2 | 0.051 |
| 1 | 1.15 | 0.042 |
| 2 | 1.1 | 0.041 |
| 3 | 1.05 | 0.093 |
| 4 & + | 1 | 0 |

Open claim loss development, Auto:

$\text{Dev. factor} = \exp(0.001 + 0.01 \times \text{dev. year} + 0.008\,e)$

dev. year: development year
e: random variable that follows the standard normal distribution

Reopen probability, Home and Auto:

| Dev. Year | Prob. |
|---|---|
| 0 | 0.02 |
| 1 | 0.015 |
| 2 | 0.01 |
| 3 | 0.005 |
| 4 & + | 0 |

Reopen claim loss development, Home and Auto:

| Dev. Year | Mean Dev. Factor | Volatility |
|---|---|---|
| 0 | 1.05 | 0.095 |
| 1 | 1.1 | 0.084 |
| 2 | 1.05 | 0.07 |
| 3 | 1.06 | 0.078 |
| 4 | 1.07 | 0.025 |
| 5 | 1.08 | 0.079 |
| 6 | 1.09 | 0.013 |
| 7 | 1.06 | 0.053 |
| 8 & + | 1 | 0 |

Normal copula correlations:

| Pair | Home | Auto |
|---|---|---|
| Severity and Settlement Lag | 0.9034 | 0.003 |
| Severity and Report Lag | -0.0154 | 0.009 |
| Settlement Lag and Report Lag | -0.0147 | -0.0162 |

LAE, Home:

$\text{LAE} = 5 + 0.01 \times \text{dev. year} + 0.005 \times \text{incurred loss} + 5\,e$

dev. year: development year
e: random variable that follows the standard normal distribution

LAE, Auto:

$\text{LAE} = \log(1.05 + 0.01\,e)$

The remainder of this section summarizes the testing result, with R code performing the testing included in the appendix.

4.1 Open Claim Loss Development

Open claim loss development patterns derived from simulated data match the simulation assumptions quite well.

For the Home line, a loss development factor table based on development year is assumed. For the Auto line, an exponential regression function is assumed. The testing result is shown in Table 9.

Table 9. Open Claim Loss Development Testing Result

Home:

| Dev. Year | Assumed Mean Dev. Factor | Assumed Volatility | Test Mean Dev. Factor | Test Volatility |
|---|---|---|---|---|
| 0 | 1.2 | 0.051 | NA | NA |
| 1 | 1.15 | 0.042 | 1.1513 | 0.03985 |
| 2 | 1.1 | 0.041 | 1.1002 | 0.04177 |
| 3 | 1.05 | 0.093 | NA | NA |
| 4 & + | 1 | 0 | NA | NA |

Auto:

Assumption: $\text{Dev. factor} = \exp(0.001 + 0.01 \times \text{dev. year} + 0 \times \text{incurred loss} + 0.008\,e)$

Test result: $\text{Dev. factor} = \exp(0.001001 + 0.01004 \times \text{dev. year} - 1.8 \times 10^{-8} \times \text{incurred loss} + 0.007991\,e)$

Adjusted R²: 58.4%. The t tests on the intercept and on the parameter for dev. year have p values less than 2.2e-16.
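The Auto test amounts to regressing the log of the simulated development factor on the development year. The sketch below reproduces that idea on synthetic data generated from the assumed formula; it is an illustration only, it omits the incurred-loss term (whose assumed coefficient is 0), and the function names and the uniform choice of development years 0 to 9 are hypothetical.

```python
import math
import random


def simulate_dev_factors(n: int, seed: int = 0):
    """Simulate (dev_year, factor) pairs from exp(0.001 + 0.01*dev_year + 0.008*e)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        dev_year = rng.randrange(0, 10)  # hypothetical spread of development years
        e = rng.gauss(0.0, 1.0)          # standard normal noise
        rows.append((dev_year, math.exp(0.001 + 0.01 * dev_year + 0.008 * e)))
    return rows


def fit_log_linear(rows):
    """Ordinary least squares of log(factor) on dev_year.

    Returns (intercept, slope, residual standard deviation).
    """
    n = len(rows)
    xs = [x for x, _ in rows]
    ys = [math.log(f) for _, f in rows]
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, math.sqrt(sse / (n - 2))


if __name__ == "__main__":
    print(fit_log_linear(simulate_dev_factors(5000)))
```

The fitted intercept, slope and residual standard deviation play the roles of 0.001, 0.01 and 0.008 in the assumed formula, which is how the test result line above should be read.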

4.2 Claim Reopenness

For claim reopenness, the simulated reopen probability, resettlement lag and reopen loss development have been tested against the simulation assumptions and they match well, as shown in Table 10 to Table 12.

Reopen Probability

Table 10. Claim Reopen Probability Testing Result

| Business Line | Dev. Year | Assumed Prob. | Test Prob. | Test Volatility |
|---|---|---|---|---|
| Home | 0 | 0.02 | 0.01959 | 0.0033 |
| Home | 1 | 0.015 | 0.01480 | 0.0031 |
| Home | 2 | 0.01 | 0.00898 | 0.0030 |
| Home | 3 | 0.005 | 0.00536 | 0.0019 |
| Home | 4 & + | 0 | 0 | 0 |
| Auto | 0 | 0.02 | 0.01901 | 0.0024 |
| Auto | 1 | 0.015 | 0.01517 | 0.0022 |
| Auto | 2 | 0.01 | 0.01069 | 0.0015 |
| Auto | 3 | 0.005 | 0.00520 | 0.0013 |
| Auto | 4 & + | 0 | 0 | |

Resettlement Lag

Table 11. Resettlement Lag Simulation Testing Result

| Business Line | Home | Auto |
|---|---|---|
| Assumption | Exponential (rate=0.01) | Exponential (rate=0.005) |
| Test Result | Exponential (rate=0.00965) | Exponential (rate=0.00532) |
| Standard error of estimate | 0.000394 | 0.000121 |
| K-S Test against Assumption: statistic | 0.0401 | 0.0424 |
| K-S Test against Assumption: p-value | 0.3026 | 0.0034 |


Reopen Loss Development

Table 12. Claim Reopen Loss Development Testing Result

| Dev. Year | Assumed Mean Dev. Factor | Assumed Volatility | Home Test Mean Dev. Factor | Home Test Volatility | Auto Test Mean Dev. Factor | Auto Test Volatility |
|---|---|---|---|---|---|---|
| 0 | 1.05 | 0.095 | NA | NA | 1.0451 | 0.08103 |
| 1 | 1.1 | 0.084 | NA | NA | 1.0906 | 0.07433 |
| 2 | 1.05 | 0.07 | 1.0488 | 0.0529 | 1.0404 | 0.06836 |
| 3 | 1.06 | 0.078 | 1.0648 | 0.0610 | 1.0520 | 0.07981 |
| 4 | 1.07 | 0.025 | 1.0727 | 0.0241 | 1.0670 | 0.02344 |
| 5 | 1.08 | 0.079 | 1.0651 | 0.0656 | 1.0574 | 0.06514 |
| 6 | 1.09 | 0.013 | 1.0837 | 0.0101 | 1.0795 | 0.01384 |
| 7 | 1.06 | 0.053 | NA | NA | 1.0679 | 0.05988 |
| 8 & + | 1 | 0 | NA | NA | NA | NA |

The assumed factors and volatilities are the same for the Home and Auto lines.

Simulated reopen lag (from the last close date to reopen date) cannot be directly compared to the simulation assumption as it is truncated at the evaluation date (2017-12-31). Reopen dates are checked to make sure they are later than the evaluation date. As expected, simulated reopen dates for old closed claims are very close to the evaluation date, because the reopen date lag is assumed to have a mean of 200 days.

4.3 Distribution

Report lag, settlement lag, frequency and severity distributions derived from the simulated data match the simulation assumptions quite well, as shown in Table 13 to Table 16.


Report Lag

Table 13. Report Lag Simulation Testing Result

| Business Line | Home | Auto |
|---|---|---|
| Assumption | Weibull (shape=9.276, scale=798.361) | Exponential (rate=0.011) |
| Test Result | Weibull (shape=8.787, scale=768.0863) | Exponential (rate=0.0153) |
| Standard error of estimate | 0.0623, 0.8348 | 0.00007 |
| K-S Test against Assumption: statistic | 0.1311 | 0.2787044 |
| K-S Test against Assumption: p-value | 0 | 0 |

Settlement Lag

Table 14. Settlement Lag Simulation Testing Result

| Business Line | Home | Auto |
|---|---|---|
| Assumption | Weibull (shape=6.479, scale=179.004) | Weibull (shape=1.005, scale=311.572) |
| Test Result | Weibull (shape=6.4828, scale=179.1017) | Weibull (shape=1.0032, scale=311.3343) |
| Standard error of estimate | 0.0260, 0.1499 | 0.0033, 1.3829 |
| K-S Test against Assumption: statistic | 0.0106 | 0.0032 |
| K-S Test against Assumption: p-value | 0.0004 | 0.6001 |

Frequency

Table 15. Frequency Simulation Testing Result

| Business Line | Home | Auto |
|---|---|---|
| Assumption | Poisson (λ = 101.067) | Negative Binomial (size=2102.025, prob=0.926); Mean = 168, Variance = 181 |
| Test Result | Poisson (λ = 100.6617) | Negative Binomial (size=1165.012, prob=0.8747); Mean = 167, Variance = 191 |
| Standard error of estimate | 0.2896 | |
| Chi-Squared Test against Assumption: statistic | 13.827 | 31.63137 |
| Chi-Squared Test against Assumption: p-value | 0.2427 | 0.0009 |
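The means and variances quoted for the Negative Binomial rows follow from the (size, prob) parameterization, in which the mean is size × (1 − prob) / prob and the variance is the mean divided by prob. The short sketch below checks the quoted figures; the function name is hypothetical.

```python
def negative_binomial_moments(size: float, prob: float):
    """Mean and variance of a negative binomial in the (size, prob) parameterization."""
    mean = size * (1.0 - prob) / prob
    variance = mean / prob
    return mean, variance


if __name__ == "__main__":
    for label, size, prob in (
        ("Assumption", 2102.025, 0.926),
        ("Test result", 1165.012, 0.8747),
    ):
        mean, variance = negative_binomial_moments(size, prob)
        print(label, round(mean), round(variance))
```

Comparing moments is useful here because quite different (size, prob) pairs can describe nearly the same distribution when size is large, so the fitted parameters alone can look further from the assumption than the distribution really is.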

Severity

Table 16. Severity Simulation Testing Result

| Business Line | Home | Auto |
|---|---|---|
| Assumption | Lognormal (μ=12.05007, σ=0.41117) | Lognormal (μ=8.92714, σ=0.52153) |
| Test Result | Lognormal (μ=12.05186, σ=0.41016) | Lognormal (μ=8.926003, σ=0.5196) |
| Standard error of estimate | 0.0021, 0.0015 | 0.0022, 0.0016 |
| K-S Test against Assumption: statistic | 0.0052 | 0.0030 |
| K-S Test against Assumption: p-value | 0.2603 | 0.6879 |

4.4 Copula

Frequency Copula

It is assumed that the monthly frequencies of the Home line and the Auto line follow a normal copula with a correlation coefficient of 61.31%. The simulated data exhibits a similar relationship. The fitted normal copula has a correlation coefficient of 63.7% with a standard error of 1.7%. A goodness-of-fit test based on the Cramer-von Mises statistic is also performed, with a test statistic of 0.0152 and a p-value of 0.898. This means that the test does not reject the hypothesis that the simulated data has the same frequency relationship as assumed.
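To make the normal copula assumption concrete, the sketch below draws pairs of uniforms linked by a Gaussian copula and then recovers the correlation from their normal scores. It is illustrative only: the real test works on simulated Poisson and Negative Binomial frequencies and uses MLE in R, whereas this sketch stays on the uniform scale and uses a simple moment estimate. Function names are hypothetical.

```python
import math
import random
from statistics import NormalDist

_STD = NormalDist()


def sample_gaussian_copula(n: int, rho: float, seed: int = 0):
    """Draw n pairs (u, v) of uniforms whose dependence is a normal copula with correlation rho."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        pairs.append((_STD.cdf(z1), _STD.cdf(z2)))
    return pairs


def normal_score_correlation(pairs) -> float:
    """Pearson correlation of the normal scores of (u, v)."""
    def score(u: float) -> float:
        # Clamp away from 0 and 1, where the inverse CDF is undefined.
        return _STD.inv_cdf(min(max(u, 1e-12), 1.0 - 1e-12))

    zs = [(score(u), score(v)) for u, v in pairs]
    n = len(zs)
    mx = sum(a for a, _ in zs) / n
    my = sum(b for _, b in zs) / n
    sxx = sum((a - mx) ** 2 for a, _ in zs)
    syy = sum((b - my) ** 2 for _, b in zs)
    sxy = sum((a - mx) * (b - my) for a, b in zs)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    print(normal_score_correlation(sample_gaussian_copula(5000, 0.6131)))
```

In a full frequency simulation, each uniform would then be passed through the inverse CDF of its line's frequency distribution, which preserves the copula while giving the required margins.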

Severity, Settlement Lag and Report Lag

The testing results for the copula among severity, settlement lag and report lag also show that the simulator reflects the copula assumptions properly.

Table 17. Severity, Settlement Lag and Report Lag Copula

Normal copula correlations:

| Pair | Home Assumption | Home Test Result | Auto Assumption | Auto Test Result |
|---|---|---|---|---|
| Severity and Settlement Lag | 0.9034 | 0.90256 | 0.003 | 0.00004 |
| Severity and Report Lag | -0.0154 | -0.0330 | 0.009 | -0.0094 |
| Settlement Lag and Report Lag | -0.0147 | -0.0343 | -0.0162 | -0.021 |

Goodness-of-fit test:

| Business Line | Test Statistic | p-value |
|---|---|---|
| Home | 0.01412 | 0.9776 |
| Auto | 0.01752 | 0.9975 |

4.5 Exposure Index

The Home line assumes a level exposure index, which means that the business volume stays unchanged. The average exposure index based on simulated data fluctuates around the level line, as shown in Figure 11.

Figure 11. Home Line Exposure Index Test


The Auto line assumes an 8% annual increase in exposure. The average exposure index based on simulated data indicates the same pattern, as shown in Figure 12.

Figure 12. Auto Line Exposure Index Test
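The 8% annual growth corresponds to a monthly growth factor of 1.08^(1/12). A minimal sketch of the resulting index, assuming the first month is the base month with index 1 and a 132-month horizon (the plotting window used in the Appendix); the variable names are illustrative:

```r
# Sketch only: the horizon and the variable names are illustrative assumptions.
annual_growth <- 0.08
n_months <- 132

# Monthly factor that compounds to 8% over 12 months
monthly_factor <- (1 + annual_growth)^(1 / 12)

# Index is 1 in the base month and grows by monthly_factor each month
exposure_index <- cumprod(c(1, rep(monthly_factor, n_months - 1)))

index_after_one_year <- exposure_index[13]     # 12 months of growth
index_after_ten_years <- exposure_index[121]   # 120 months of growth
```

Month 13 sits exactly one year after the base month, so its index is 1.08. The simulated index in Figure 12 is the average monthly claim count divided by the first month's count, and is compared against this curve.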

4.6 LAE

In the simulation, it is assumed that the Home line LAE = 5 + 0.01 × dev.year + 0.005 × incurred loss + 5e. The testing result shows that Home line LAE = 5.001 + 0.01361 × dev.year + 0.005 × incurred loss + 4.997e. The adjusted R² equals 99.97%.

The Auto line LAE is assumed to follow log(1.05 + 0.01e). The testing result shows that Auto line LAE = log(1.05 + 0.00002 × dev.year + 0 × incurred loss + 0.01046e), which is very close to the assumption.
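The Home line check can be sketched as a simulate-then-regress exercise. The sketch assumes that e is a standard normal error term, so that the coefficient on e is recovered as the residual standard deviation of the regression; that reading, the covariate ranges, the sample size and the variable names are all assumptions made for illustration. The Appendix regression also includes osRatio as a covariate, which is left out here.

```r
# Sketch only: e ~ N(0, 1), the covariate ranges, n and the seed are assumptions.
set.seed(2018)
n <- 10000
dev_year <- sample(1:10, n, replace = TRUE)
incurred_loss <- runif(n, 1000, 100000)
e <- rnorm(n)

# Home line LAE assumption
lae <- 5 + 0.01 * dev_year + 0.005 * incurred_loss + 5 * e

# Recover the coefficients by ordinary least squares
fit <- lm(lae ~ dev_year + incurred_loss)
coefs <- coef(fit)
resid_sd <- summary(fit)$sigma          # estimate of the multiplier on e
adj_r2 <- summary(fit)$adj.r.squared
```

Under these assumptions the dev.year coefficient is small relative to its sampling error, so it is estimated far less precisely than the incurred loss coefficient. That is consistent with the fitted 0.01361 differing visibly from the assumed 0.01 while the other terms match closely.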

4.7 Deductible and Limit

The simulated data is tested to make sure that

Ultimate Loss = Min(Max(Deductible, Severity), Deductible + Limit) - Deductible

Here, severity is the loss before the deductible and limit are applied.
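A short worked example of this formula, using the Home line deductible of 20000 and a limit of 200000; the function name and the three severities are illustrative:

```r
# Sketch only: ultimate_loss() and the sample severities are illustrative.
ultimate_loss <- function(severity, deductible, limit) {
  # Loss in the layer between the deductible and deductible + limit
  pmin(pmax(deductible, severity), deductible + limit) - deductible
}

severity <- c(15000, 120000, 500000)
result <- ultimate_loss(severity, deductible = 20000, limit = 200000)
# result is 0, 100000, 200000
```

A loss below the deductible pays nothing, a loss inside the layer pays the excess over the deductible, and a loss above the layer is capped at the limit.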

The distributions of deductible and limit derived from the simulated data match the simulation assumptions very well, as shown in Table 18.

Table 18. Deductible and Limit Simulation Testing Result


Business Line | Item | Assumption | Test Result
Home | Deductible | 20000 | 20000
Home | Limit (Prob.) | 200000 (37.5%), 300000 (62.5%) | 200000 (37.5%), 300000 (62.5%)
Auto | Deductible | 1000 | 1000
Auto | Limit (Prob.) | 8000 (20%), 15000 (30%), 20000 (50%) | 8000 (19.9%), 15000 (30.3%), 20000 (49.8%)

In addition to the tests mentioned above, some basic checks of the simulated data have been conducted as well.

1. Report date >= Occurrence date
2. Settlement date >= Report date
3. Claim reopen date >= Settlement date (Last close date)
4. Claim reopen date >= Evaluation date (2017-12-31)
5. Resettlement date >= Claim reopen date
6. IBNR Report date > Evaluation date (2017-12-31)
7. Future claim occurrence date is between evaluation date (2017-12-31) and simulation end date (2018-12-31).

Given these satisfactory testing results, the simulator is judged to capture the simulation assumptions well.


Appendix. Simulation Module Test R Code

The following R code is used to test the simulation module as described in Section 4, Simulation.

###########################Simulation Module Test##############################################

library(fitdistrplus) #Provides fitdist(), used in the distribution tests below

#Read in simulated data
setwd("C:/temp/CAS/test")
simdata <- read.csv("sim.csv")
simdata <- simdata[simdata$Sim<11,] #You may want to use the first 10 portfolio simulations to save time
gc()

###########################Basic Check################################################
#Report date >= Occurrence date
sum(as.numeric(as.Date(simdata$reportDate)-as.Date(simdata$occurrenceDate))<0)
#Settlement date >= Report date
sum(as.numeric(as.Date(simdata$settlementDate)-as.Date(simdata$reportDate))<0)
#Reopen date >= Settlement date
fitdata <- simdata[!is.na(simdata$reopenDate),]
sum(as.numeric(as.Date(fitdata$reopenDate)-as.Date(fitdata$settlementDate))<0)
#Reopen date >= Evaluation date (2017-12-31)
sum(as.numeric(as.Date(fitdata$reopenDate)-as.Date("2017-12-31"))<0)
#Resettle date >= Reopen date
fitdata <- simdata[!is.na(simdata$reopenDate),]
sum(as.numeric(as.Date(fitdata$resettleDate)-as.Date(fitdata$reopenDate))<0)
#IBNR Report date > Evaluation date (2017-12-31)
fitdata <- simdata[simdata$status == "IBNR",]
sum(as.numeric(as.Date(fitdata$reportDate)-as.Date("2017-12-31"))<0)
#UPR Occurrence date > Evaluation date (2017-12-31)
fitdata <- simdata[simdata$status == "UPR",]
sum(as.numeric(as.Date(fitdata$occurrenceDate)-as.Date("2017-12-31"))<0)
#UPR Occurrence date <= Future date (2018-12-31)
sum(as.numeric(as.Date(fitdata$occurrenceDate)-as.Date("2018-12-31"))>0)

###########################Test Open Claim Loss Development#############################
openc <- simdata[simdata[,"status"]=="OPEN",]
###Development year at the valuation date where incurred losses are recorded
openc[,"devYears"] <- ceiling(as.numeric(as.Date("2017-12-31")-as.Date(openc$occurrenceDate))/365)
###Development year at the settlement date
openc[,"settleYears"] <- ceiling(as.numeric(as.Date(openc$settlementDate)-as.Date(openc$occurrenceDate))/365)
###Cumulative development factors from valuation date to settlement date
openc[,"cdf"] <- openc$ultimateLoss/openc$incurredLoss

###Function to calculate expected cumulative development factors
CumDevFac <- function(devYears,settleYears,meanDevFac){
  nDevFac<-pmin(length(meanDevFac),settleYears-1)
  n<-length(meanDevFac)
  result<-vector()
  for (i in c(1:length(nDevFac))) {
    if (is.na(nDevFac[i]) == TRUE){
      result <- c(result,NA)
    } else {
      if(devYears[i]==settleYears[i]){
        result <- c(result,1)
      } else {
        result <- c(result,prod(meanDevFac[pmin(devYears[i],nDevFac[i]):nDevFac[i]]))
      }
    }
  }
  result
}

###Calculate expected cumulative development factors
###This is for Home Line
meanDevFac <- c(1.2,1.15,1.1,1.05,1)
#This is the assumed expected mean development factor.
openc[,"excdf"] <- CumDevFac(openc$devYears,openc$settleYears,meanDevFac)
exagg<-aggregate(excdf ~ devYears + settleYears, data = openc[openc[,"LoB"]=="Home",], mean)
#This is the mean development factor from the simulated data.
agg<-aggregate(cdf ~ devYears + settleYears, data = openc[openc[,"LoB"]=="Home",], mean)
#This is the standard deviation of the development factor from the simulated data.
aggsd<-aggregate(cdf ~ devYears + settleYears, data = openc[openc[,"LoB"]=="Home",], sd)

#This is the expected mean development factor from the simulated data.
openc[,"expectedcdf"] <- openc$expectedLoss/openc$incurredLoss
exeagg<-aggregate(expectedcdf ~ devYears + settleYears, data = openc[openc[,"LoB"]=="Home",], mean)

###This is for Auto Line
regdata <- openc[openc[,"LoB"]=="Auto",]

#Function to calculate severity index
getindex <- function(monthlyindex,startDate,dates) {
  years <- as.numeric(substr(as.character(dates),1,4))
  months <- as.numeric(substr(as.character(dates),6,7))
  startyear <- as.numeric(substr(as.character(startDate),1,4))
  startmonth <- as.numeric(substr(as.character(startDate),6,7))
  indices <- pmax(1,pmin(360,(years-startyear)*12+(months-startmonth)+1))
  monthlyindex[indices]
}

#Calculate severity index
startDate <- as.Date("2008-01-01")
monthlyindex <- c(rep(1,120),cumprod(c(1*1.03^(1/12),rep(1.03^(1/12),239))))
severityindex <- getindex(monthlyindex, startDate, regdata$settlementDate)

#Expected cumulative development factor after detrending
regdata[,"expectedcdf"] <- regdata$expectedLoss/regdata$incurredLoss/severityindex

regdata <- regdata[,colnames(regdata) %in% c("cdf","devYears","settleYears","incurredLoss","osRatio","expectedcdf")]
regdata$cdf <- log(regdata$cdf/severityindex)
f <- as.formula(cdf ~ devYears + incurredLoss + osRatio)
exeagg<-aggregate(expectedcdf ~ devYears + settleYears, data = regdata, mean)
lm <- lm(f,data = regdata)
summary(lm)

###########################Test Reopen Probability###########################
###Home Line
closedata <- simdata[simdata$LoB == "Home" & simdata$status=="CLOSED", ]
#Close year at the valuation date
closelags <- as.numeric(as.Date("2017-12-31") - as.Date(closedata[,"settlementDate"]))
closedata[,"closeYears"] <- pmax(1,ceiling(closelags/365))

#Calculate reopen probability by development year for each simulation
reopendata <- closedata[!is.na(closedata$reopenDate), ]
agg<-aggregate(ClaimID ~ closeYears + Sim, data = closedata, length)
reopenagg<-aggregate(ClaimID ~ closeYears + Sim, data = reopendata, length)
agg$reopenProb <- 0
for (i in c(1:nrow(agg))){
  for (j in c(1:nrow(reopenagg))){
    if (agg[i,1] == reopenagg[j,1] & agg[i,2] == reopenagg[j,2]) {
      agg[i,4] = reopenagg[j,3]/agg[i,3]
    }
  }
}
meanprob <- aggregate(reopenProb ~ closeYears, data = agg, mean)
meanprob
sdprob <- aggregate(reopenProb ~ closeYears, data = agg, sd)
sdprob

###Auto Line
closedata <- simdata[simdata$LoB == "Auto" & simdata$status=="CLOSED", ]
#Close year at the valuation date
closelags <- as.numeric(as.Date("2017-12-31") - as.Date(closedata[,"settlementDate"]))
closedata[,"closeYears"] <- pmax(1,ceiling(closelags/365))

#Calculate reopen probability by development year for each simulation
reopendata <- closedata[!is.na(closedata$reopenDate), ]
agg<-aggregate(ClaimID ~ closeYears + Sim, data = closedata, length)
reopenagg<-aggregate(ClaimID ~ closeYears + Sim, data = reopendata, length)
agg$reopenProb <- 0
for (i in c(1:nrow(agg))){
  for (j in c(1:nrow(reopenagg))){
    if (agg[i,1] == reopenagg[j,1] & agg[i,2] == reopenagg[j,2]) {
      agg[i,4] = reopenagg[j,3]/agg[i,3]
    }
  }
}
meanprob <- aggregate(reopenProb ~ closeYears, data = agg, mean)
meanprob
sdprob <- aggregate(reopenProb ~ closeYears, data = agg, sd)
sdprob

###########################Test Resettlement Lag#############################
###Home Line
#Get resettlement lag data
fitdata <- simdata[simdata$LoB == "Home" & simdata$status=="CLOSED" & !is.na(simdata$reopenDate), ]
resettlementLags <- as.numeric(as.Date(fitdata[,"resettleDate"])-as.Date(fitdata[,"reopenDate"]))
resettlementLags <- ifelse(resettlementLags==0,runif(1),resettlementLags)
rm(fitdata)
gc()
#Resettlement lag distribution fitting
fit<-fitdist(resettlementLags, distr="exp", method="mle", discrete=FALSE)

summary(fit)

#Distribution fitting k-s test
#This is to make sure there is no tie in the data for k-s test
x<-resettlementLags + max(abs(resettlementLags))*0.0001*runif(length(resettlementLags),0,1)
#k-s test
z<-ks.test(x,"pexp", 0.01)
z$statistic
z$p.value

###Auto Line
#Get resettlement lag data
fitdata <- simdata[simdata$LoB == "Auto" & simdata$status=="CLOSED" & !is.na(simdata$reopenDate), ]
resettlementLags <- as.numeric(as.Date(fitdata[,"resettleDate"])-as.Date(fitdata[,"reopenDate"]))
resettlementLags <- ifelse(resettlementLags==0,runif(1),resettlementLags)
rm(fitdata)
gc()
#Resettlement lag distribution fitting
fit<-fitdist(resettlementLags, distr="exp", method="mle", discrete=FALSE)

summary(fit)

#Distribution fitting k-s test
#This is to make sure there is no tie in the data for k-s test
x<-resettlementLags + max(abs(resettlementLags))*0.0001*runif(length(resettlementLags),0,1)
#k-s test
z<-ks.test(x,"pexp", 0.005)
z$statistic
z$p.value

###########################Test Reopen Claim Loss Development########################
###Function to calculate expected cumulative development factors
CumDevFac <- function(devYears,settleYears,meanDevFac){
  nDevFac<-pmin(length(meanDevFac),settleYears-1)
  n<-length(meanDevFac)
  result<-vector()
  for (i in c(1:length(nDevFac))) {
    if (is.na(nDevFac[i]) == TRUE){
      result <- c(result,NA)
    } else {
      if(devYears[i]==settleYears[i]){
        result <- c(result,1)
      } else {
        result <- c(result,prod(meanDevFac[pmin(devYears[i],nDevFac[i]):nDevFac[i]]))
      }
    }
  }
  result
}

###Home Line
#Get reopen claim data
fitdata <- simdata[simdata$LoB == "Home" & simdata$status=="CLOSED" & !is.na(simdata$reopenDate), ]
#Development year at the valuation date
fitdata[,"devYears"] <- ceiling(as.numeric(as.Date("2017-12-31")-as.Date(fitdata$occurrenceDate))/365)
#Development year at the resettlement date
fitdata[,"settleYears"] <- ceiling(as.numeric(as.Date(fitdata$resettleDate)-as.Date(fitdata$occurrenceDate))/365)
#Cumulative development factors from valuation date to resettlement date
fitdata[,"cdf"] <- fitdata$reopenLoss/fitdata$incurredLoss

#Calculate expected cumulative development factors
meanDevFac <- c(1.05,1.1,1.05,1.06,1.07,1.08,1.09,1.06,1)
#This is the assumed expected mean development factor.
fitdata[,"excdf"] <- CumDevFac(fitdata$devYears,fitdata$settleYears,meanDevFac)
exagg<-aggregate(excdf ~ devYears + settleYears, data = fitdata, mean)
#This is the mean development factor from the simulated data.
agg<-aggregate(cdf ~ devYears + settleYears, data = fitdata, mean)
#This is the standard deviation of the development factor from the simulated data.
aggsd<-aggregate(cdf ~ devYears + settleYears, data = fitdata, sd)

#This is the expected mean development factor from the simulated data.
fitdata[,"expectedcdf"] <- fitdata$expectedLoss/fitdata$incurredLoss
exeagg<-aggregate(expectedcdf ~ devYears + settleYears, data = fitdata, mean)

###Auto Line
#Get reopen claim data
fitdata <- simdata[simdata$LoB == "Auto" & simdata$status=="CLOSED" & !is.na(simdata$reopenDate), ]
#Development year at the valuation date
fitdata[,"devYears"] <- ceiling(as.numeric(as.Date("2017-12-31")-as.Date(fitdata$occurrenceDate))/365)
#Development year at the resettlement date
fitdata[,"settleYears"] <- ceiling(as.numeric(as.Date(fitdata$resettleDate)-as.Date(fitdata$occurrenceDate))/365)

#Function to calculate severity index
getindex <- function(monthlyindex,startDate,dates) {
  years <- as.numeric(substr(as.character(dates),1,4))
  months <- as.numeric(substr(as.character(dates),6,7))
  startyear <- as.numeric(substr(as.character(startDate),1,4))
  startmonth <- as.numeric(substr(as.character(startDate),6,7))
  indices <- pmax(1,pmin(360,(years-startyear)*12+(months-startmonth)+1))
  monthlyindex[indices]
}

#Calculate severity index
startDate <- as.Date("2008-01-01")
monthlyindex <- c(rep(1,120),cumprod(c(1*1.03^(1/12),rep(1.03^(1/12),239))))
severityindex <- getindex(monthlyindex, startDate, fitdata$resettleDate)

#Cumulative development factors from valuation date to resettlement date
fitdata[,"cdf"] <- fitdata$reopenLoss/fitdata$incurredLoss/severityindex

#Calculate expected cumulative development factors
meanDevFac <- c(1.05,1.1,1.05,1.06,1.07,1.08,1.09,1.06,1)
#This is the assumed expected mean development factor.
fitdata[,"excdf"] <- CumDevFac(fitdata$devYears,fitdata$settleYears,meanDevFac)
exagg<-aggregate(excdf ~ devYears + settleYears, data = fitdata, mean)
#This is the mean development factor from the simulated data.
agg<-aggregate(cdf ~ devYears + settleYears, data = fitdata, mean)
#This is the standard deviation of the development factor from the simulated data.
aggsd<-aggregate(cdf ~ devYears + settleYears, data = fitdata, sd)

#This is the expected mean development factor from the simulated data.
fitdata[,"expectedcdf"] <- fitdata$expectedLoss/fitdata$incurredLoss/severityindex
exeagg<-aggregate(expectedcdf ~ devYears + settleYears, data = fitdata, mean)

###########################Test Report Lag######################################################
###Home Line
#Get report lag data
fitdata <- simdata[simdata$LoB == "Home" & simdata$status=="UPR", ]
reportLags <- as.numeric(as.Date(fitdata[,"reportDate"])-as.Date(fitdata[,"occurrenceDate"]))
reportLags <- ifelse(reportLags==0,runif(1),reportLags)
rm(fitdata)
gc()
#Report lag distribution fitting
fit<-fitdist(reportLags, distr="weibull", method="mle", discrete=FALSE)

summary(fit)

#Distribution fitting k-s test
#This is to make sure there is no tie in the data for k-s test
x<-reportLags + max(abs(reportLags))*0.0001*runif(length(reportLags),0,1)
#k-s test
z<-ks.test(x,"pweibull", 9.276, 798.361)
z$statistic
z$p.value

###Auto Line
#Get report lag data
fitdata <- simdata[simdata$LoB == "Auto" & simdata$status=="UPR", ]
reportLags <- as.numeric(as.Date(fitdata[,"reportDate"])-as.Date(fitdata[,"occurrenceDate"]))
reportLags <- ifelse(reportLags==0,runif(1),reportLags)
rm(fitdata)
gc()
#Report lag distribution fitting
fit<-fitdist(reportLags, distr="exp", method="mle", discrete=FALSE)

summary(fit)

#Distribution fitting k-s test
#This is to make sure there is no tie in the data for k-s test
x<-reportLags + max(abs(reportLags))*0.0001*runif(length(reportLags),0,1)
#k-s test
z<-ks.test(x,"pexp", 0.011)
z$statistic
z$p.value

###########################Test Settlement Lag###############################
###Home Line
#Get settlement lag data
fitdata <- simdata[simdata$LoB == "Home" & (simdata$status=="IBNR" | simdata$status=="UPR"), ]
settlementLags <- as.numeric(as.Date(fitdata[,"settlementDate"])-as.Date(fitdata[,"reportDate"]))
settlementLags <- ifelse(settlementLags==0,runif(1),settlementLags)
rm(fitdata)
gc()
#Settlement lag distribution fitting
fit<-fitdist(settlementLags, distr="weibull", method="mle", discrete=FALSE)

summary(fit)

#Distribution fitting k-s test
#This is to make sure there is no tie in the data for k-s test
x<-settlementLags + max(abs(settlementLags))*0.0001*runif(length(settlementLags),0,1)
#k-s test
z<-ks.test(x,"pweibull", 6.479, 179.004)
z$statistic
z$p.value

###Auto Line
#Get settlement lag data
fitdata <- simdata[simdata$LoB == "Auto" & (simdata$status=="IBNR" | simdata$status=="UPR"), ]
settlementLags <- as.numeric(as.Date(fitdata[,"settlementDate"])-as.Date(fitdata[,"reportDate"]))
settlementLags <- ifelse(settlementLags==0,runif(1),settlementLags)
rm(fitdata)
gc()
#Settlement lag distribution fitting
fit<-fitdist(settlementLags, distr="weibull", method="mle", discrete=FALSE)

summary(fit)

#Distribution fitting k-s test
#This is to make sure there is no tie in the data for k-s test
x<-settlementLags + max(abs(settlementLags))*0.0001*runif(length(settlementLags),0,1)
#k-s test
z<-ks.test(x,"pweibull", 1.005, 311.572)
z$statistic
z$p.value

###########################Test Severity Distribution#############################
###Home Line
#Get severity data
fitdata <- simdata[simdata$LoB == "Home" & (simdata$status=="IBNR" | simdata$status=="UPR"), ]
severity <- fitdata$totalLoss
rm(fitdata)
gc()
#Severity distribution fitting
fit<-fitdist(severity, distr="lnorm", method="mle", discrete=FALSE)

summary(fit)

#Distribution fitting k-s test
#This is to make sure there is no tie in the data for k-s test
x<-severity + max(abs(severity))*0.0001*runif(length(severity),0,1)
#k-s test
z<-ks.test(x,"plnorm", 12.05007, 0.41117)
z$statistic
z$p.value

###Auto Line
#Get severity data
fitdata <- simdata[simdata$LoB == "Auto" & (simdata$status=="IBNR" | simdata$status=="UPR"), ]
severity <- fitdata$totalLoss
#Remove severity trend
#Function to calculate severity index
getindex <- function(monthlyindex,startDate,dates) {
  years <- as.numeric(substr(as.character(dates),1,4))
  months <- as.numeric(substr(as.character(dates),6,7))
  startyear <- as.numeric(substr(as.character(startDate),1,4))
  startmonth <- as.numeric(substr(as.character(startDate),6,7))
  indices <- pmax(1,pmin(360,(years-startyear)*12+(months-startmonth)+1))
  monthlyindex[indices]
}

#Calculate severity index
startDate <- as.Date("2008-01-01")
monthlyindex <- c(rep(1,120),cumprod(c(1*1.03^(1/12),rep(1.03^(1/12),239))))
severityindex <- getindex(monthlyindex, startDate, fitdata$settlementDate)
severity <- severity/severityindex

rm(fitdata)
gc()

#Severity distribution fitting
fit<-fitdist(severity, distr="lnorm", method="mle", discrete=FALSE)

summary(fit)

#Distribution fitting k-s test
#This is to make sure there is no tie in the data for k-s test
x<-severity + max(abs(severity))*0.0001*runif(length(severity),0,1)
#k-s test
z<-ks.test(x,"plnorm", 8.92714, 0.52153)
z$statistic
z$p.value

###########################Test Severity, Settlement Lag and Report Lag Copula#######################
###Home Line
#Get data (Only future claims as severity, settlement lag and report lag are all simulated)
fitdata <- simdata[simdata$LoB == "Home" & simdata$status == "UPR", ]
fitdata$reportLag <- as.numeric(as.Date(fitdata[,"reportDate"])-as.Date(fitdata[,"occurrenceDate"]))
fitdata$settlementLag <- as.numeric(as.Date(fitdata[,"settlementDate"])-as.Date(fitdata[,"reportDate"]))

#Remove severity trend
#Function to calculate severity index
getindex <- function(monthlyindex,startDate,dates) {
  years <- as.numeric(substr(as.character(dates),1,4))
  months <- as.numeric(substr(as.character(dates),6,7))
  startyear <- as.numeric(substr(as.character(startDate),1,4))
  startmonth <- as.numeric(substr(as.character(startDate),6,7))
  indices <- pmax(1,pmin(360,(years-startyear)*12+(months-startmonth)+1))
  monthlyindex[indices]
}

#Calculate severity index
startDate <- as.Date("2008-01-01")
monthlyindex <- c(rep(1,360))
severityindex <- getindex(monthlyindex, startDate, fitdata$settlementDate)
fitdata$severity <- fitdata$totalLoss/severityindex

fitdata <- fitdata[,colnames(fitdata) %in% c("severity", "settlementLag", "reportLag")]
fitdata <- fitdata[,c(3,2,1)]
gc()

library(copula)
cop <- normalCopula(c(0,0,0), dim=3, dispstr="un")
u <- pobs(fitdata)
fitcop <- fitCopula(cop, u)

assumedcop <- normalCopula(c(0.9034,-0.0154,-0.0147), dim=3, dispstr="un")
gof <- gofCopula(assumedcop, u, N=200, simulation="mult", method="Sn", ties=FALSE, hideWarnings = TRUE)

###Auto Line
#Get data (Only future claims as severity, settlement lag and report lag are all simulated)
fitdata <- simdata[simdata$LoB == "Auto" & simdata$status == "UPR", ]
fitdata$reportLag <- as.numeric(as.Date(fitdata[,"reportDate"])-as.Date(fitdata[,"occurrenceDate"]))
fitdata$settlementLag <- as.numeric(as.Date(fitdata[,"settlementDate"])-as.Date(fitdata[,"reportDate"]))

#Remove severity trend
#Function to calculate severity index
getindex <- function(monthlyindex,startDate,dates) {
  years <- as.numeric(substr(as.character(dates),1,4))
  months <- as.numeric(substr(as.character(dates),6,7))
  startyear <- as.numeric(substr(as.character(startDate),1,4))
  startmonth <- as.numeric(substr(as.character(startDate),6,7))
  indices <- pmax(1,pmin(360,(years-startyear)*12+(months-startmonth)+1))
  monthlyindex[indices]
}

#Calculate severity index
startDate <- as.Date("2008-01-01")
monthlyindex <- c(rep(1,120),cumprod(c(1*1.03^(1/12),rep(1.03^(1/12),239))))
severityindex <- getindex(monthlyindex, startDate, fitdata$settlementDate)
fitdata$severity <- fitdata$totalLoss/severityindex

fitdata <- fitdata[,colnames(fitdata) %in% c("severity", "settlementLag", "reportLag")]
fitdata <- fitdata[,c(3,2,1)]
gc()

library(copula)
cop <- normalCopula(c(0,0,0), dim=3, dispstr="un")
u <- pobs(fitdata)
fitcop <- fitCopula(cop, u)

assumedcop <- normalCopula(c(0.003,0.0009,-0.0162), dim=3, dispstr="un")
u <- u[1:10000,] #Too much data. Using only the first 10000 to avoid memory limit break.
gof <- gofCopula(assumedcop, u, N=200, simulation="mult", method="Sn", ties=FALSE, hideWarnings = TRUE)

###########################Test Frequency Distribution##########################
###Home Line
#Get frequency data. Future claims are used because closed and open claims are fixed based on claim data and will underestimate the volatility.
fitdata <- simdata[simdata$LoB == "Home" & simdata$status == "UPR", ]
fitdata$index <- (as.numeric(substr(as.character(fitdata$occurrenceDate),1,4))-2008)*12+as.numeric(substr(as.character(fitdata$occurrenceDate),6,7))
freqagg <- aggregate(ClaimID ~ index + Sim, data = fitdata, length)
frequency <- freqagg[,3]
rm(fitdata)
gc()
#Frequency distribution fitting
fit<-fitdist(frequency, distr="pois", method="mle", discrete=TRUE)

summary(fit)

#Distribution fitting chi squared test
l <- 101.067 #Assumption
x <- frequency
m = mean(x)
s=sqrt(var(x))

mybreak<-c(m-4*s, m-3*s,m-2*s,m-s, m-s/2, m-s/4,m,m+s/4,m+s/2,m+s,m+2*s,m+3*s,m+4*s)
mybreak <-mybreak[mybreak>=0]
mybreak<-unique(round(mybreak))

mycut<-cut(x,breaks = mybreak)
empirical<-as.vector(table(mycut))
mybreak2<-mybreak[seq(2, length(mybreak), by=1)]
mybreak1<-mybreak[seq(1, length(mybreak)-1, by=1)]

prob<- ppois(mybreak2, l)-ppois(mybreak1, l)
z<-chisq.test(empirical, p=prob, rescale.p=TRUE)
z$statistic
z$p.value

###Auto Line
#Get frequency data
fitdata <- simdata[simdata$LoB == "Auto" & simdata$status == "UPR", ]
fitdata$index <- (as.numeric(substr(as.character(fitdata$occurrenceDate),1,4))-2008)*12+as.numeric(substr(as.character(fitdata$occurrenceDate),6,7))
freqagg <- aggregate(ClaimID ~ index + Sim, data = fitdata, length)

#Remove frequency trend (exposure index)
freqagg[,3] <- freqagg[,3]/(1.08^(freqagg[,1]/12))

frequency <- round(freqagg[,3])
rm(fitdata)
gc()
#Frequency distribution fitting
fit<-fitdist(frequency, distr="nbinom", method="mle", discrete=TRUE, lower = c(0, 0))

summary(fit)

#Distribution fitting chi squared test
size <- 2102.025 #Assumption
p <- 0.926 #Assumption
x <- frequency
m = mean(x)
s=sqrt(var(x))

mybreak<-c(m-4*s, m-3*s,m-2*s,m-s, m-s/2, m-s/4,m,m+s/4,m+s/2,m+s,m+2*s,m+3*s,m+4*s)
mybreak <-mybreak[mybreak>=0]
mybreak<-unique(round(mybreak))

mycut<-cut(x,breaks = mybreak)
empirical<-as.vector(table(mycut))
mybreak2<-mybreak[seq(2, length(mybreak), by=1)]
mybreak1<-mybreak[seq(1, length(mybreak)-1, by=1)]

prob<- pnbinom(mybreak2, size, p)-pnbinom(mybreak1, size, p)
z<-chisq.test(empirical, p=prob, rescale.p=TRUE)
z$statistic
z$p.value

###########################Test Exposure Index######################################
###Home Line
#Get frequency data
fitdata <- simdata[simdata$LoB == "Home", ]
fitdata$index <- (as.numeric(substr(as.character(fitdata$occurrenceDate),1,4))-2008)*12+as.numeric(substr(as.character(fitdata$occurrenceDate),6,7))
freqagg <- aggregate(ClaimID ~ index + Sim, data = fitdata, length)
frequency <- aggregate(ClaimID ~ index, data = freqagg, mean)
rm(fitdata)
gc()

monthlyindex <- c(rep(1,360))

simindex <- c(rep(NA,360))

for (i in c(1:360)){
  for (j in c(1:nrow(frequency))){
    if (frequency[j,1]==i) {
      simindex[i]=frequency[j,2]/frequency[1,2]
    }
  }
}

plot(simindex[1:132],xlab="Month",ylab="Index")
lines(monthlyindex[1:132],col="green")

###Auto Line
#Get frequency data
fitdata <- simdata[simdata$LoB == "Auto", ]
fitdata$index <- (as.numeric(substr(as.character(fitdata$occurrenceDate),1,4))-2008)*12+as.numeric(substr(as.character(fitdata$occurrenceDate),6,7))
freqagg <- aggregate(ClaimID ~ index + Sim, data = fitdata, length)
frequency <- aggregate(ClaimID ~ index, data = freqagg, mean)
rm(fitdata)
gc()

monthlyindex <- cumprod(c(1,rep(1.08^(1/12),359)))

simindex <- c(rep(NA,360))

for (i in c(1:360)){
  for (j in c(1:nrow(frequency))){
    if (frequency[j,1]==i) {
      simindex[i]=frequency[j,2]/frequency[1,2]
    }
  }
}

plot(simindex[1:132],xlab="Month",ylab="Index")
lines(monthlyindex[1:132],col="green")

###########################Test Frequency Copula################################
###Get frequency data (only future claims as open and closed claims are fixed, not simulated)
fitdata <- simdata[simdata$status == "UPR",]
fitdata$index <- (as.numeric(substr(as.character(fitdata$occurrenceDate),1,4))-2008)*12+as.numeric(substr(as.character(fitdata$occurrenceDate),6,7))
freqagg <- aggregate(ClaimID ~ index + Sim + LoB, data = fitdata, length)
home <- freqagg[freqagg$LoB=="Home",]
auto <- freqagg[freqagg$LoB=="Auto",]
home$auto <- ifelse(home$index==auto$index & home$Sim==auto$Sim,auto$ClaimID,NA)

#Remove Auto Frequency Trend
home$auto <- round(home$auto/(1.08^(home$index/12)))

fitdata <- home[,colnames(home) %in% c("ClaimID", "auto")]
gc()

library(copula)
cop <- normalCopula(c(0), dim=2, dispstr="un")
u <- pobs(fitdata)
fitcop <- fitCopula(cop, u)

assumedcop <- normalCopula(c(0.6131), dim=2, dispstr="un")
gof <- gofCopula(assumedcop, u, N=200, simulation="mult", method="Sn", ties=FALSE, hideWarnings = TRUE)

###########################Test Deductible and Limit################################
#Check if ultimate Loss is smaller than Limit
ultimateLosses <- ifelse(is.na(simdata$resettleDate), simdata$ultimateLoss, simdata$reopenLoss)
sum(ultimateLosses > simdata$Limit)
#Check if ultimate loss = min(max(deductible, total loss),deductible + Limit)-deductible
fitdata <- simdata[simdata$status == "IBNR" | simdata$status=="UPR", ]
summary(fitdata$ultimateLoss - (pmin(pmax(fitdata$Deductible,fitdata$totalLoss),fitdata$Deductible+fitdata$Limit)-fitdata$Deductible))
rm(fitdata)
gc()

###Home Line
#Get deductible and limit data
deductibles <- simdata[simdata$LoB == "Home",]$Deductible
for (i in unique(deductibles)){
  if (length(deductibles[deductibles==i])/length(deductibles)>0.1){
    print(paste0(i," ",round(length(deductibles[deductibles==i])/length(deductibles),3)))
  }
}

limits <- simdata[simdata$LoB == "Home",]$Limit
for (i in unique(limits)){
  if (length(limits[limits==i])/length(limits)>0.1){
    print(paste0(i," ",round(length(limits[limits==i])/length(limits),3)))
  }
}

###Auto Line
#Get deductible and limit data
deductibles <- simdata[simdata$LoB == "Auto",]$Deductible
for (i in unique(deductibles)){
  if (length(deductibles[deductibles==i])/length(deductibles)>0.1){
    print(paste0(i," ",round(length(deductibles[deductibles==i])/length(deductibles),3)))
  }
}

limits <- simdata[simdata$LoB == "Auto",]$Limit
for (i in unique(limits)){
  if (length(limits[limits==i])/length(limits)>0.1){
    print(paste0(i," ",round(length(limits[limits==i])/length(limits),3)))
  }
}

###########################Test LAE#########################################################
fitdata <- simdata[simdata[,"status"]=="IBNR" | simdata[,"status"]=="UPR",]
###Development year at the valuation date where incurred losses are recorded
fitdata[,"devYears"] <- ceiling(as.numeric(as.Date(fitdata$reportDate)-as.Date(fitdata$occurrenceDate))/365)

###This is for Home Line
regdata <- fitdata[fitdata[,"LoB"]=="Home",]

regdata <- regdata[,colnames(regdata) %in% c("ultimateLAE","expectedLAE","devYears","incurredLoss","osRatio")]
f <- as.formula(ultimateLAE ~ devYears + incurredLoss + osRatio)
lm <- lm(f,data = regdata)
summary(lm)

f <- as.formula(expectedLAE ~ devYears + incurredLoss + osRatio)
lm <- lm(f,data = regdata)
summary(lm)

###This is for Auto Line
regdata <- fitdata[fitdata[,"LoB"]=="Auto",]

regdata <- regdata[,colnames(regdata) %in% c("ultimateLAE","expectedLAE","devYears","incurredLoss","osRatio")]
regdata$ultimateLAE <- exp(regdata$ultimateLAE)
regdata$expectedLAE <- exp(regdata$expectedLAE)
f <- as.formula(ultimateLAE ~ devYears + incurredLoss + osRatio)
lm <- lm(f,data = regdata)
summary(lm)

f <- as.formula(expectedLAE ~ devYears + incurredLoss + osRatio)
lm <- lm(f,data = regdata)
summary(lm)
