WOW World of Walkover-weight “My God, it’s full of cows!” (David Bowman, 2001)


Can walkover-weight suggest a cow needs attention?

Join with breeding information …

Position at the outset …

Obstacle: No health information!!!

Suggested: Milking order (i.e. where a cow is in the herd/line-up) is hierarchical and affected by health issues

Proposed goal: to predict a drop in milking order using WOW and other facts

Assumptions … deck of cards

Same cows come in for milking each time

Cows are well-behaved (e.g. arrive in a nice queue)

Data is in good shape (e.g. one reading per cow per milking)

Data problems

Multiple entries for cows (e.g. four entries for 22719193 in QBH2005)

  Delete duplicate weights (SQL problem?)

  Cow skipped and recycled back into order

  Use average if more than one value

About a quarter of the data are zeroes …

dataset     instances   0 weights   λ weights
BBYG2006      182,935      57,288         815
BBYG2007      206,545      39,726       1,193
JJVX2007        7,850           0          73
QBH2005         7,850           0          73
QBH2006       324,365      80,362          72
QBH2007       222,300      67,109       2,118
QBH2008        48,534      10,535         224
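The "use average if more than one value" fix and the zero count can be sketched in a few lines. This is a minimal sketch, not the project's code: the (cow tag, milking id, weight) record layout and the function names are assumptions, and the weights below are invented for illustration.

```python
from collections import defaultdict
from statistics import mean

def average_duplicates(readings):
    """Collapse repeated (cow, milking) readings into one averaged weight.

    `readings` is an iterable of (cow_tag, milking_id, weight) tuples;
    this layout is an assumption, not the project's schema.
    """
    grouped = defaultdict(list)
    for cow_tag, milking_id, weight in readings:
        grouped[(cow_tag, milking_id)].append(weight)
    return {key: mean(weights) for key, weights in grouped.items()}

def zero_fraction(weights_by_key):
    """Fraction of (cow, milking) entries whose weight is exactly zero."""
    if not weights_by_key:
        return 0.0
    zeros = sum(1 for w in weights_by_key.values() if w == 0)
    return zeros / len(weights_by_key)

# Illustrative readings only; the weights are made up.
sample = [
    ("22719193", 1, 510.0),
    ("22719193", 1, 514.0),  # duplicate entry for the same milking
    ("17102150", 1, 0.0),    # a "zero" weight
    ("17102150", 2, 498.0),
]
cleaned = average_duplicates(sample)
share_of_zeros = zero_fraction(cleaned)
```

Averaging a real weight with a zero would drag the value down, so zeroes probably need to be separated out before this step.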

“zero” problems

Differentiate between a missing cow, a missing weight and a “zero” weight

Ignore missing cows

Cow skipped and recycled back into order

Time-based interpolation

  Can be problematic if cow has been missing for a while

  Add flag to indicate weight was “guessed”
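Time-based interpolation with a “guessed” flag might look like the sketch below. It assumes linear interpolation between the nearest known weights (the slides do not say which scheme was used), that missing or zero weights have already been turned into None, and that time is measured in days. The `max_gap` cut-off is a hypothetical guard for the cow that has been missing for a while.

```python
def interpolate_weights(series, max_gap=3.0):
    """Fill missing weights by linear interpolation over time.

    `series` is a list of (time, weight) pairs sorted by time, with
    weight set to None where it is missing. Returns a list of
    (time, weight, guessed) triples. A gap wider than `max_gap`
    between the surrounding known readings is left unfilled.
    """
    known = [(t, w) for t, w in series if w is not None]
    result = []
    for t, w in series:
        if w is not None:
            result.append((t, w, False))
            continue
        before = [(kt, kw) for kt, kw in known if kt < t]
        after = [(kt, kw) for kt, kw in known if kt > t]
        if not before or not after:
            result.append((t, None, False))  # nothing to interpolate from
            continue
        (t0, w0), (t1, w1) = before[-1], after[0]
        if t1 - t0 > max_gap:
            result.append((t, None, False))  # cow missing for too long
            continue
        guess = w0 + (w1 - w0) * (t - t0) / (t1 - t0)
        result.append((t, guess, True))      # flag the value as guessed
    return result

# Illustrative values only.
filled = interpolate_weights(
    [(0.0, 500.0), (0.5, None), (1.0, 510.0), (1.5, None), (6.0, 520.0)]
)
```

Keeping the flag next to the value lets later steps down-weight or exclude guessed weights.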

Other issues in data preparation

Change milking date to milk index

Change birthdate to age in months

Change parturition date to days since last calved

Additional derivatives

  milking index - cow’s position in milk order

  ∆-index - change in index for a cow over various time periods (1, 3 and 7 days)

  mu-weight - average weight over varying-length periods (3, 7, 14, 21 and 28 milkings)

  ∆-mu-weight - change in mu-weight for a cow (1, 3, and 7 days)
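The derived attributes above can be sketched as follows. All function names are hypothetical. Two assumptions are worth flagging: mu-weight is taken to be a trailing mean over the last N milkings, and the ∆ helper counts its lag in list entries, so a lag in days has to be converted using the number of milkings per day.

```python
from datetime import date

def age_in_months(birthdate, milking_date):
    """Whole months between birth and milking."""
    months = (milking_date.year - birthdate.year) * 12
    months += milking_date.month - birthdate.month
    if milking_date.day < birthdate.day:
        months -= 1  # the current month is not yet complete
    return months

def days_since_calved(parturition_date, milking_date):
    """Days elapsed since the last recorded calving."""
    return (milking_date - parturition_date).days

def mu_weight(weights, window):
    """Trailing mean over the last `window` milkings (None until enough data)."""
    out = []
    for i in range(len(weights)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(weights[i + 1 - window:i + 1]) / window)
    return out

def delta(values, lag):
    """Change in a value compared with `lag` entries earlier."""
    out = []
    for i, value in enumerate(values):
        if i < lag or value is None or values[i - lag] is None:
            out.append(None)
        else:
            out.append(value - values[i - lag])
    return out

# Illustrative values only.
weights = [500, 510, 520, 530]
mu3 = mu_weight(weights, 3)
delta_mu3 = delta(mu3, 1)
```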

Does [change in] milk order correlate to WOW?

Correlation coefficients QBH2006 (dense)

WOW to index == 0.12

WOW to 14-day mu-weight == 0.93

Index to 10-day mu-weight == 0.14

3-day ∆-order to ∆-weight == 0.045
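The slides do not say which coefficient was used. Assuming it is Pearson's r, figures like these can be reproduced from two equal-length columns with a sketch such as this one (the function name and toy data are illustrative):

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

# Toy data: a perfectly linear pair and a perfectly inverse pair.
r_up = pearson([1, 2, 3], [2, 4, 6])
r_down = pearson([1, 2, 3], [3, 2, 1])
```

A value near 0.93 (WOW to 14-day mu-weight) is what one would expect when one column is a smoothed version of the other, whereas 0.12 and 0.045 indicate almost no linear relationship.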

3-day ∆-order and 3-day ∆-weight

Predict change in milking order

Use M5P to predict how the milking order will change for a cow at the next milking

Approx. 205,000 QBH2006 samples (with fewer than 5/25 missing attributes)

2/3 training

1/3 testing

Re-running took too long … but … you’ve all seen it before, where accuracy was 51.89% (discrimination 0.527) and the model tree was hugely ugly (65 nodes, 33 leaves).

Also tried predicting cow’s index as decile and as ratio to herd size.

<missing results go here when available>

[Figure: Cow’s position (index) as ratio to herd size. Vertical axis 0 to 1.2; horizontal axis 1 to 85.]

QBH2008 Tag:17102150

[Figure: Cow index vs. herd size. Vertical axis 0 to 1200; horizontal axis 1 to 85.]

Where to? …

Data must still be scrubbed so that milking order makes sense (if milking order is going to be relevant)

Perhaps cow order needs to be described in completely different terms (e.g. cow buddies)

Easy visualization of herds/cows/breeds/dates/trends is needed

  this segued into another area of the project …

  Visualization tools (alpha and beta)

In the meantime … health data is obtained …

Can WOW predict onset of illness?

Combine original attributes and derivatives with health judgments

Cows with unknown health are considered healthy

Need equal number of positive and negative instances
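Treating unknown health as healthy amounts to a join with a default label. The sketch below assumes health records can be keyed by (cow tag, date), which is a guess at how the two sources line up. The "NONE" label matches the class name used in the classifier output later in the deck; the dates and the second tag are invented.

```python
def attach_health(milkings, health_events, default="NONE"):
    """Label each milking with a diagnosis, or `default` when none is recorded.

    `milkings` is a list of (cow_tag, day) pairs and `health_events`
    maps (cow_tag, day) to a diagnosis. Both layouts are assumptions.
    """
    return [
        (cow_tag, day, health_events.get((cow_tag, day), default))
        for cow_tag, day in milkings
    ]

# Illustrative data only.
milkings = [("17102150", "2008-03-01"), ("17102150", "2008-03-02"),
            ("00000001", "2008-03-01")]
events = {("17102150", "2008-03-02"): "UDDER DISORDER"}
labelled = attach_health(milkings, events)
```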

Health data becomes available

farm    year    Qty > 50
BBYG    2006          73
BBYG    2007          95
BBYG    2008         220
QBH     2005         113
QBH     2006         282
QBH     2007         481
QBH     2008         253

Not so much health data

1613 recorded instances of health

913 different cows with health info

2540 cows with milking info

788 milked cows with health data

7 broad categories of illness:

  Calving disorder

  Metabolic disorder

  Udder disorder (only one with >50 in herd)

  Reproductive disorder

  Lameness

  Infectious diseases

  Other ailments

Data sparseness

QBH2006

75 instances out of 324,291 have a health record

  63 udder disorder

  10 metabolic disorder

  2 lameness

Only about 0.02% positives (75 / 324,291) → will never be isolated → must subsample negatives

Random selection of 75 negatives → data sparseness → over-fitting likely

Data sparseness

QBH2006

36 cows have illness at some time, so just learn those?

11,966 records for those cows, 76 of which have illness (still <1% positive)

Random selection of 1% as negatives (about 120)

Refinements to approach

QBH2006

Restrict target objective to UDDER DISORDER

Randomly select equal number of negatives from cows who have health problem at some point

Goal: differentiate between healthy and unhealthy state

Detecting mastitis amidst random normal cows

QBH2006

Restrict learning objective to UDDER DISORDER

Randomly select equal number of negatives from all cows that have been milked (63+,63-)
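Both sampling variants (negatives from cows that were ill at some point, or from all milked cows) can be expressed with one helper. This is a sketch under assumptions: records are dicts with "cow" and "health" keys, healthy records carry the label "NONE", and records with other illnesses are left out of both classes. The toy records are invented.

```python
import random

def balanced_sample(records, target="UDDER DISORDER", negative_cows=None, seed=0):
    """Return all `target` positives plus an equal number of random negatives.

    If `negative_cows` is given, negatives are drawn only from those
    cows; otherwise they are drawn from every milked cow.
    """
    positives = [r for r in records if r["health"] == target]
    negatives = [
        r for r in records
        if r["health"] == "NONE"
        and (negative_cows is None or r["cow"] in negative_cows)
    ]
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    k = min(len(positives), len(negatives))
    return positives + rng.sample(negatives, k)

def cows_ever_ill(records):
    """Cows with at least one non-healthy record."""
    return {r["cow"] for r in records if r["health"] != "NONE"}

# Toy records, invented for illustration only.
records = (
    [{"cow": "A", "health": "UDDER DISORDER"}] * 2
    + [{"cow": "A", "health": "NONE"}] * 5
    + [{"cow": "B", "health": "NONE"}] * 10
    + [{"cow": "C", "health": "LAMENESS"}]
    + [{"cow": "C", "health": "NONE"}] * 3
)
from_all = balanced_sample(records)
from_ill = balanced_sample(records, negative_cows=cows_ever_ill(records))
```

Drawing negatives from cows that were ill at some point asks the model to tell a cow's sick state from its own healthy state; drawing from all cows asks it to tell sick cows from the herd at large.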

When is a cow sick?

So far, attempted to predict health label at point of milking, but …

… when was the health label attached?

  before, during or after the current milking?

Goal: predict whether cow needs attention at the next milking (i.e. time series)
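Recasting the task as "attention needed at the next milking" means pairing each milking's attributes with the label of the milking that follows. A minimal sketch, assuming one chronological list of (features, label) rows per cow; the row layout and values are illustrative:

```python
def shift_labels(rows):
    """Pair the features at milking t with the label at milking t + 1.

    `rows` is a chronological list of (features, label) pairs for a
    single cow. The last milking has no successor and is dropped.
    """
    return [(rows[i][0], rows[i + 1][1]) for i in range(len(rows) - 1)]

# Illustrative rows only.
rows = [
    ({"weight": 500}, "NONE"),
    ({"weight": 480}, "NONE"),
    ({"weight": 470}, "UDDER DISORDER"),
]
shifted = shift_labels(rows)
```

The shift has to be done per cow, so that the last milking of one cow is never paired with the first milking of another.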

=== Summary ===

Correctly Classified Instances          90               70.3125 %
Incorrectly Classified Instances        38               29.6875 %
Kappa statistic                          0.4026
Mean absolute error                      0.3446
Root mean squared error                  0.4532
Relative absolute error                 68.8933 %
Root relative squared error             90.5974 %
Total Number of Instances              128

=== Detailed Accuracy By Class ===

TP Rate   FP Rate   Precision   Recall   F-Measure   ROC Area   Class
  0.508     0.108       0.821    0.508       0.627      0.707   UDDER DISORDER
  0.892     0.492       0.652    0.892       0.753      0.707   NONE

=== Confusion Matrix ===

  a  b   <-- classified as
 32 31 |  a = UDDER DISORDER
  7 58 |  b = NONE
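The headline figures follow directly from the confusion matrix. The sketch below recomputes them, with UDDER DISORDER as the positive class, using the standard definitions of precision, recall, F-measure and Cohen's kappa; the function name is illustrative.

```python
def binary_metrics(tp, fn, fp, tn):
    """Accuracy, precision, recall, F-measure and Cohen's kappa from counts."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / total ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "kappa": kappa,
    }

# Counts taken from the confusion matrix: row a = 32, 31; row b = 7, 58.
metrics = binary_metrics(tp=32, fn=31, fp=7, tn=58)
```

The recall of about 0.51 is the number to watch here: roughly half of the udder disorder instances are classified as NONE.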

Agenda

Replace quantified attributes with simpler (e.g. boolean, nominal) ones

Characterise exceptions

  Below average weight for cow/herd/breed/age

  Dropped decile/>50 in order

Broad statistical measures

  How many std.devs. from mean

  z-score (probability of variation)

Choose negative instances more carefully (select fewer interpolates)

Spend more time with people who know cows
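The "how many std.devs. from mean" measure could be sketched as a z-score against a cow's own recent weights. Two assumptions: the reference group is the cow's own history (it could equally be herd, breed or age group), and the weights are roughly normal, which is what the tail probability relies on. The names and values are illustrative.

```python
from statistics import NormalDist, mean, stdev

def z_score(value, history):
    """Number of sample standard deviations `value` lies from the mean of `history`."""
    if len(history) < 2:
        raise ValueError("need at least two historical values")
    spread = stdev(history)
    if spread == 0:
        raise ValueError("history has no variation")
    return (value - mean(history)) / spread

def lower_tail_probability(z):
    """Chance of a value at least this low, assuming a normal distribution."""
    return NormalDist().cdf(z)

# Illustrative weights only.
z = z_score(490, [500, 510, 520])
p = lower_tail_probability(z)
is_exception = z < -1.5  # hypothetical threshold for a boolean attribute
```

Thresholding the z-score gives one of the simpler boolean attributes the agenda asks for.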
