
Analysis of Alpaca Fiber Data


Page 1: Analysis of Alpaca Fiber Data

Analysis of Alpaca Fiber Data

By Ying Luo and Kristen Swinton

Page 2: Analysis of Alpaca Fiber Data

Background

Alpacas are llama-like animals bred in South America from which fibers are collected to make wool.

Different properties of the fibers affect the quality of the wool.

Page 3: Analysis of Alpaca Fiber Data

Response Variables

Our client was interested in both the tensile strength and the scale of the fibers.

Tensile strength is a measure of the breaking strength of the fiber.

Scale is a measure of the distance between the “scales,” or cells on the surface of the fiber.

Page 4: Analysis of Alpaca Fiber Data

More About Scales

The scales are important as they help the fibers interlock to form felt.

Fibers with more scales, and consequently a shorter distance between them, are more likely to have been damaged.

Hence, a higher scale measure indicates healthier fibers.

Page 5: Analysis of Alpaca Fiber Data

Explanatory Variables

The client suspected that the breed of alpaca used and the diet it is fed might affect the tensile strength and scale of the fibers.

Data was collected for 22 alpacas of two breeds: Suri and Huacaya.

Page 6: Analysis of Alpaca Fiber Data

Explanatory Variables

One of the diets of interest is a diet meant to simulate the diet of an animal in the wild. This diet is referred to as the “low nutrition diet.”

The other diet is a more typical diet of animals raised in captivity. This is the “high nutrition diet.”

Page 7: Analysis of Alpaca Fiber Data

Covariates

Data was also collected on the animals’ gender, age at the beginning of the study, and color.

These factors were included in the analysis along with the original explanatory variables.

Page 8: Analysis of Alpaca Fiber Data

Complications

Many observations were taken on each animal. To simplify the analysis, we took the mean tensile strength and mean scale for each animal.

Breed and gender were confounded: there were no female Suris in the sample.

Unbalanced data further complicated the analysis.
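The per-animal averaging mentioned on this slide can be sketched as follows. This is only an illustration, not the authors' code: the animal IDs and readings are made up, and the function name `mean_per_animal` is hypothetical.

```python
from collections import defaultdict
from statistics import mean


def mean_per_animal(observations):
    """Collapse repeated fiber measurements to one mean per animal.

    `observations` is an iterable of (animal_id, value) pairs.
    """
    by_animal = defaultdict(list)
    for animal_id, value in observations:
        by_animal[animal_id].append(value)
    return {animal_id: mean(values) for animal_id, values in by_animal.items()}


# Hypothetical tensile strength readings (made-up IDs and values).
readings = [("A1", 12.0), ("A1", 14.0), ("A2", 9.0), ("A2", 10.0), ("A2", 11.0)]
animal_means = mean_per_animal(readings)
print(animal_means)
```

Averaging gives one response per animal, so each animal contributes a single observation to the later F-tests regardless of how many fibers were measured.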

Page 9: Analysis of Alpaca Fiber Data

Time Periods

Data was collected one year after the study began. This is called period four.

More data was collected two years into the study. This is period eight.

We analyzed each set of data separately and compared the results.

Page 10: Analysis of Alpaca Fiber Data

Strategy of Analysis

We started with the tensile strength data.

After analyzing the complete set of data, we analyzed two subsets: males only and whites only.

The males only analysis was done to eliminate the breed and gender confounding.

The whites only analysis was suggested by the client in an attempt to balance the data.

Page 11: Analysis of Alpaca Fiber Data

Tensile Strength Analysis: Complete Data

We started with the data from period four.

We first tried to fit models with only one factor and assess their significance.

Two factors had significant F-tests: breed and gender.

A gender only model does not give any information about the factors of primary interest.
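As a rough sketch of what a single-factor F-test computes, the function below forms the ratio of between-level to within-level mean squares. The data values are invented for illustration and `one_way_f` is a hypothetical helper; the slides do not say which software was used. Turning F into a p-value needs the F distribution, which is left out here.

```python
from statistics import mean


def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-factor ANOVA.

    `groups` maps each factor level to a list of responses.
    """
    all_values = [y for ys in groups.values() for y in ys]
    grand_mean = mean(all_values)
    # Between-level sum of squares: level means around the grand mean.
    ss_between = sum(len(ys) * (mean(ys) - grand_mean) ** 2 for ys in groups.values())
    # Within-level sum of squares: observations around their own level mean.
    ss_within = sum((y - mean(ys)) ** 2 for ys in groups.values() for y in ys)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up per-animal mean tensile strengths, for illustration only.
example = {"Huacaya": [14.0, 15.0, 16.0], "Suri": [10.0, 11.0, 12.0]}
f_stat, df1, df2 = one_way_f(example)
print(f_stat, df1, df2)
```

A large F means the level means differ by more than the within-level scatter would suggest.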

Page 12: Analysis of Alpaca Fiber Data

Complete Data

Diet was not significant alone, but did have a significant interaction with gender.

It appears that the low nutrition diet produced higher tensile strengths for both genders.

Figure One: Diet by Gender Interaction Plot (variables shown: tensile strength, axis ticks 12 to 15; gender, f and m; diet, Low and High)
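An interaction plot like Figure One is drawn from cell means. The sketch below, using made-up numbers and hypothetical function names, computes the diet by gender cell means and a simple interaction contrast: the diet effect in females minus the diet effect in males.

```python
from statistics import mean


def cell_means(rows):
    """Average the response within each (diet, gender) cell."""
    cells = {}
    for diet, gender, y in rows:
        cells.setdefault((diet, gender), []).append(y)
    return {cell: mean(ys) for cell, ys in cells.items()}


def interaction_contrast(means):
    """Diet effect (low minus high) in females minus the same effect in males.

    Zero corresponds to parallel lines in the interaction plot.
    """
    effect_f = means[("low", "f")] - means[("high", "f")]
    effect_m = means[("low", "m")] - means[("high", "m")]
    return effect_f - effect_m


# Made-up values, chosen only to show the calculation.
rows = [
    ("low", "f", 15.0), ("low", "f", 14.0),
    ("high", "f", 13.0), ("high", "f", 12.0),
    ("low", "m", 13.5), ("low", "m", 12.5),
    ("high", "m", 13.0), ("high", "m", 12.0),
]
means = cell_means(rows)
contrast = interaction_contrast(means)
print(means, contrast)
```

In this toy example the low diet is higher for both genders, but by different amounts, which is the pattern the slide describes.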

Page 13: Analysis of Alpaca Fiber Data

Complete Data

As with diet, color did not have a significant effect on tensile strength by itself.

It did, however, interact with breed.

Figure Two: Breed by Color Interaction (labels as extracted: Color, codes 100, 201, 204, 205, 209, 360, 410; Scale, axis ticks 7 to 17; Breed, Huacaya and Suri)

Page 14: Analysis of Alpaca Fiber Data

Possible Models

Model 1: $Y_{ij} = \mu + B_i + \varepsilon_{ij}$
F = 21.59, p-value = 0.0002

Model 2: $Y_{ijkl} = \mu + D_i + B_j + G_k + (DG)_{ik} + \varepsilon_{ijkl}$
F = 6.42, p-value = 0.0024

Model 3: $Y_{ijk} = \mu + B_i + C_j + (BC)_{ij} + \varepsilon_{ijk}$
F = 8.02, p-value = 0.0006

Page 15: Analysis of Alpaca Fiber Data

Period Eight

The period eight data produced the same three models.

The p-values were a bit larger, but still significant at the 0.05 level.

This suggests no significant effect of the passage of one year.

Period Eight
Model      F      p-value
Model 1    6.3    0.0208
Model 2    7.73   0.001
Model 3    4.16   0.0115

Page 16: Analysis of Alpaca Fiber Data

Males Only: Period Four

Model 1 (breed only) had a p-value of 0.0036.

Because this data only has one gender, there is no longer any confounding of breed and gender.

Since this model is significant, there may be a true breed effect.
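The confounding, and why the males only subset removes it, can be checked with a simple cross-tabulation. The roster below is hypothetical (it only mimics the reported absence of female Suris), and the helper names are illustrative.

```python
from collections import Counter


def cross_tab(animals, first, second):
    """Count animals in each (first, second) combination of two factors."""
    return Counter((a[first], a[second]) for a in animals)


def empty_cells(animals, first, second):
    """List factor-level combinations that never occur in the sample."""
    counts = cross_tab(animals, first, second)
    levels_1 = sorted({a[first] for a in animals})
    levels_2 = sorted({a[second] for a in animals})
    return [(x, y) for x in levels_1 for y in levels_2 if counts[(x, y)] == 0]


# Hypothetical roster that mimics the reported gap (no female Suris).
animals = [
    {"breed": "Huacaya", "gender": "f"},
    {"breed": "Huacaya", "gender": "m"},
    {"breed": "Suri", "gender": "m"},
    {"breed": "Suri", "gender": "m"},
]
missing = empty_cells(animals, "breed", "gender")

# Restricting to males leaves a single gender level, so no cell is empty.
males = [a for a in animals if a["gender"] == "m"]
missing_in_males = empty_cells(males, "breed", "gender")
print(missing, missing_in_males)
```

With an empty breed by gender cell, a breed difference in the full data cannot be separated from a gender difference; within the males only subset that ambiguity disappears.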

Page 17: Analysis of Alpaca Fiber Data

Other Males Only Models

Model 2 is no longer appropriate since it includes the gender effect.

In examining Model 3, we noticed that the breed by color interaction was no longer significant.

This is possibly because nearly all the Huacaya males were white, while there was only one white Suri male.

Page 18: Analysis of Alpaca Fiber Data

Period Eight

As in period four, only Model 1 fit the data reasonably well.

Its p-value was 0.0755, which is greater than 0.05, but the sample size was rather small (14).

Page 19: Analysis of Alpaca Fiber Data

Whites Only: Period Four

Model 1 was again highly significant with a p-value of 0.0009.

In fitting Model 2, we discovered that the diet by gender interaction was no longer significant.

The reason for this is unclear, but could be due to confounding of color with either diet or gender.

Model 3 is clearly not applicable here as it contains a color main effect.

Page 20: Analysis of Alpaca Fiber Data

Whites Only: Period Eight

Results were similar to those of period four.

The p-value for Model 1 was 0.0085.

Page 21: Analysis of Alpaca Fiber Data

Tensile Strength Conclusions

Breed appeared to be an important factor in all the models.

A boxplot of the distribution of tensile strength for each breed illustrates this apparent effect.

Figure Three: Tensile Strength for each Breed (boxplot of tensile strength, axis ticks 6 to 18, for Huacaya and Suri)

Page 22: Analysis of Alpaca Fiber Data

Breed or Gender?

Because of the confounding, we cannot be sure that the apparent effect is truly due to breed.

A plot of the tensile strength for each breed of the males only data can clarify this issue.

Figure Four: Tensile Strength for each Breed, Males Only (tensile strength, axis ticks 6 to 16, for Huacaya and Suri)

Page 23: Analysis of Alpaca Fiber Data

Summary of Tensile Strength Conclusions

Breed seems to have a significant effect on tensile strength, but it is confounded with gender.

In looking at the males only, this effect is still present.

This effect is the same when white animals are examined alone.

Huacayas produce fibers with higher tensile strength on average than do Suris.

Page 24: Analysis of Alpaca Fiber Data

Analysis of Scale Data

Analysis was more difficult for the scale data, with different subsets producing different models.

We began our analysis as before by looking for single factor models for the period four data first.

The only single factor with a significant F-test was age.

Page 25: Analysis of Alpaca Fiber Data

Scale Data: Complete

As with the tensile strength data, there was a significant interaction between diet and gender.

The low nutrition diet produces higher scale for males, but lower scale for females.

Figure Five: Diet by Gender Interaction Plot (variables shown: scale, axis ticks 7.8 to 9.8; gender, f and m; diet, Low and High)

Page 26: Analysis of Alpaca Fiber Data

Scale Data: Complete

Color was not significant by itself, but it was significant when added to the age only model.

There was no significant interaction between color and age.

Page 27: Analysis of Alpaca Fiber Data

Possible Models

Model 4: $Y_{ij} = \mu + A_i + \varepsilon_{ij}$
F = 5.35, p-value = 0.0315

Model 5: $Y_{ijkl} = \mu + A_i + D_j + G_k + (DG)_{jk} + \varepsilon_{ijkl}$
F = 3.48, p-value = 0.0300

Model 6: $Y_{ijk} = \mu + A_i + C_j + \varepsilon_{ijk}$
F = 4.44, p-value = 0.0085

Page 28: Analysis of Alpaca Fiber Data

Period Eight

Model 4 (age only) fit well with a p-value of 0.0018.

Model 5 no longer fit because there was no longer a significant diet by gender interaction.

Gender, however, still had a significant effect when added to the age only model, but there was no age by gender interaction.

This may be another side effect of the unbalanced data.

Page 29: Analysis of Alpaca Fiber Data

Period Eight

Although it did not fit well for period four, the breed only model (Model 1) fit the period eight data quite well.

Another significant model contained the main effects for age and breed, but not their interaction.

Page 30: Analysis of Alpaca Fiber Data

Period Eight Models

Model 4: $Y_{ij} = \mu + A_i + \varepsilon_{ij}$
F = 12.99, p-value = 0.0018

Model 7: $Y_{ijk} = \mu + A_i + G_j + \varepsilon_{ijk}$
F = 9.39, p-value = 0.0015

Page 31: Analysis of Alpaca Fiber Data

Period Eight Models

Model 1: $Y_{ij} = \mu + B_i + \varepsilon_{ij}$
F = 6.67, p-value = 0.0163

Model 8: $Y_{ijk} = \mu + A_i + B_j + \varepsilon_{ijk}$
F = 11.19, p-value = 0.0006

Page 32: Analysis of Alpaca Fiber Data

Data Problems

There were a couple of rather large observations, which upon investigation appeared to be mistakes.

The client agreed that they were probably recording mistakes, so we threw them out.

Page 33: Analysis of Alpaca Fiber Data

Other Problems

The standard deviation of the measurements in period eight was half that of period four (after outliers were discarded).

This could be due to the data collector becoming more skilled in operating the machinery.
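A comparison of spread between the two periods could be made along these lines. The numbers are invented so that the ratio comes out to one half; they are not the study's measurements, and `sd_ratio` is a hypothetical helper.

```python
from statistics import stdev


def sd_ratio(later, earlier):
    """Ratio of the sample standard deviation of `later` to that of `earlier`."""
    return stdev(later) / stdev(earlier)


# Made-up measurements for two periods, for illustration only.
period_four = [8.0, 10.0, 12.0]
period_eight = [9.0, 10.0, 11.0]
ratio = sd_ratio(period_eight, period_four)
print(ratio)
```

A ratio well below one, as reported on this slide, means the later measurements were noticeably less variable.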

Page 34: Analysis of Alpaca Fiber Data

Males Only: Period Four

Neither Model 4 nor 6 fit the data well.

A diet only model became significant for the males only.

A diet by breed interaction was also significant for the males only.

Figure Seven: Diet by Breed Interaction Plot (Males) (variables shown: scale, axis ticks 8 to 10; breed, Suri and Huacaya; diet, Low and High)

Page 35: Analysis of Alpaca Fiber Data

Males Only Models: Period Four

Model 9: $Y_{ij} = \mu + D_i + \varepsilon_{ij}$
F = 8.05, p-value = 0.0150

Model 10: $Y_{ijkl} = \mu + A_i + D_j + B_k + (DB)_{jk} + \varepsilon_{ijkl}$
F = 8.06, p-value = 0.0048

Page 36: Analysis of Alpaca Fiber Data

Males Only: Period Eight

In period eight, the diet only model (Model 9) was no longer significant. Model 10 was not significant either.

The age only model (Model 4) had a p-value of 0.0276.

Model 1 (breed only) had a p-value of 0.0954, which is not strictly significant, but is worth noting.

Page 37: Analysis of Alpaca Fiber Data

Whites Only: Period Four

The only model that had a significant F-test for the white animals only was a model containing the main effects of age and diet along with their interaction.

Figure Eight: Age by Diet Interaction (variables shown: scale, axis ticks 7 to 11; age in years, 1, 2, 3, 4, 5, 6, 8, 11; diet, Low and High)

Page 38: Analysis of Alpaca Fiber Data

Whites Only: Period Eight

Curiously, the model that fit the period four data did not fit the period eight data very well.

The only model that had a reasonable fit was the breed only model (Model 1) with a p-value of 0.0513.

Page 39: Analysis of Alpaca Fiber Data

Whites Only Models

Model 11 (Period 4): $Y_{ijk} = \mu + A_i + D_j + (AD)_{ij} + \varepsilon_{ijk}$
F = 6.88, p-value = 0.0132

Model 1 (Period 8): $Y_{ij} = \mu + B_i + \varepsilon_{ij}$
F = 4.9, p-value = 0.0513

Page 40: Analysis of Alpaca Fiber Data

Scale Conclusions

It is difficult to make overall conclusions about the factors affecting scale.

One factor that appeared in most models was age.

This suggests that age does have some effect on scale.

How age interacts with the other factors is not clear.

Page 41: Analysis of Alpaca Fiber Data

Effect of Age

It appears that older animals have lower scale measurements than younger animals.

This suggests that older animals have more damaged fibers.

Figure Nine: Scale Distribution at Each Age (scale, axis ticks 6 to 12, at ages 1, 2, 3, 4, 5, 6, 8, and 11 years)

Page 42: Analysis of Alpaca Fiber Data

Scale Conclusions

The other effect worth mentioning is the breed effect.

It was not significant in period four, but had moderate significance in period eight.

Suris seem to have higher scale measurements than Huacayas, on average.

Figure Ten: Scale Distributions for Each Breed (Period 8) (scale, axis ticks 8 to 10, for Huacaya and Suri)

Page 43: Analysis of Alpaca Fiber Data

Diagnostics

It would be impractical to do residual analyses for all eleven models fit to all six subsets of data.

Since the client was primarily interested in the white animals, we only checked the residuals for the four models fit to the whites only data.

Page 44: Analysis of Alpaca Fiber Data

Model 1: Tensile Strength, Period Four

A normal probability plot of the data had one unusual observation with a residual of -3.59.

The animal (W-35) that generated this observation had an unusually small tensile strength.

There was no reason to eliminate the point.

Our assumption of normally distributed errors, therefore, may not be valid. Conclusions should be accepted with caution.

Page 45: Analysis of Alpaca Fiber Data

Model 1: Tensile Strength, Period Four

A plot of residuals vs. fits also indicated this unusual observation.

If that point is ignored, the variance appears constant.
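For a one-factor model such as Model 1, the fitted value is the level mean, so residuals are observations minus their level mean. The sketch below uses made-up data and hypothetical helper names. Dividing by the pooled root mean squared error is an assumption made here for illustration, since the slides do not say whether the reported residuals were raw or standardized.

```python
from math import sqrt
from statistics import mean


def one_factor_residuals(groups):
    """Residuals (observation minus level mean) for a one-factor model."""
    return {level: [y - mean(ys) for y in ys] for level, ys in groups.items()}


def flag_large(residuals, threshold=2.0):
    """Return (level, index, scaled residual) where |residual / s| > threshold.

    s is the root mean squared error pooled over all levels.
    """
    flat = [r for rs in residuals.values() for r in rs]
    df = len(flat) - len(residuals)
    s = sqrt(sum(r * r for r in flat) / df)
    return [
        (level, i, r / s)
        for level, rs in residuals.items()
        for i, r in enumerate(rs)
        if abs(r / s) > threshold
    ]


# Made-up tensile strengths; the last Huacaya value is deliberately low.
data = {
    "Huacaya": [15.0, 15.5, 14.5, 15.0, 10.0],
    "Suri": [11.0, 12.0, 10.0, 11.0],
}
residuals = one_factor_residuals(data)
flagged = flag_large(residuals)
print(flagged)
```

Pairing each residual with its fitted level mean gives the points for a residuals vs. fits plot, where a single point far below the rest would stand out as described above.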

Page 46: Analysis of Alpaca Fiber Data

Model 1: Tensile Strength, Period Eight

The normal probability plot for period eight looked much better than the one for period four.

The residual for W-35 was -2.98.

The scale measurement for this observation was similar for both time periods.

The residuals vs. fits plot also looked better.

Page 47: Analysis of Alpaca Fiber Data

Model 11: Scale, Period Four

The normal probability plot had a possible outlier.

The residual for this outlier was only -1.67, which is not too worrisome.

The residual vs. fits plot looked okay.

Page 48: Analysis of Alpaca Fiber Data

Model 1: Scale, Period Eight

The normal probability plot was not quite linear, but had no obvious outliers.

Since the F-test is robust to violations of the normality assumption, we feel that this is not a problem.

The residual vs. fits plot showed no evidence of non-constant variance.

Page 49: Analysis of Alpaca Fiber Data

Final Conclusions

It appears that breed is a significant predictor of tensile strength, despite its confounding with gender.

Huacayas generally produce fibers of higher tensile strength than Suris.

Page 50: Analysis of Alpaca Fiber Data

Final Conclusions

Conclusions for the scale data are not definite.

Age seemed to be a significant predictor, with older animals producing fibers with lower scale.

Breed was somewhat significant for period eight, but not for period four.

For period eight, Suris have higher scale measurements than Huacayas.

Page 51: Analysis of Alpaca Fiber Data

Recommendations

Recording errors and poor study design made the analysis difficult and cast doubt on the conclusions.

We recommend some kind of randomized block design in future studies.

Gender could be the blocking factor and age must either be balanced or restricted to one level.
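One way such a randomized block design might be set up is sketched below: diets are assigned at random within each gender block so that every block is balanced across diets. The animal IDs, the fixed seed, and the function name are all hypothetical, and the sketch does not address balancing age.

```python
import random


def assign_diets_within_blocks(animals, diets=("low", "high"), seed=0):
    """Randomly assign diets within each gender block, as evenly as possible.

    `animals` is a list of (animal_id, gender) pairs; gender is the block.
    """
    rng = random.Random(seed)
    blocks = {}
    for animal_id, gender in animals:
        blocks.setdefault(gender, []).append(animal_id)
    assignment = {}
    for ids in blocks.values():
        shuffled = ids[:]
        rng.shuffle(shuffled)
        # Deal diets out in turn so each block is balanced across diets.
        for position, animal_id in enumerate(shuffled):
            assignment[animal_id] = diets[position % len(diets)]
    return assignment


# Hypothetical animal IDs: four females and four males.
roster = [("F1", "f"), ("F2", "f"), ("F3", "f"), ("F4", "f"),
          ("M1", "m"), ("M2", "m"), ("M3", "m"), ("M4", "m")]
plan = assign_diets_within_blocks(roster)
print(plan)
```

Because randomization happens inside each block, diet is not confounded with gender, which addresses one of the difficulties the analysis ran into.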

Page 52: Analysis of Alpaca Fiber Data

Any Questions?

Page 53: Analysis of Alpaca Fiber Data

Special Thanks To . . .

Dr. Bill Notz

Mrs. Amy Thompson

Ms. Liyan Hua