Review: Synergy between mechanistic modelling and data-driven models for modern animal production systems in the era of big data

J. L. Ellis1†, M. Jacobs2, J. Dijkstra3, H. van Laar2, J. P. Cant1, D. Tulpan1 and N. Ferguson4

1Department of Animal Biosciences, University of Guelph, Guelph, ON N1G 2W1, Canada; 2Innovation Department, Trouw Nutrition, Boxmeer 5830AE, The Netherlands; 3Animal Nutrition Group, Wageningen University, Wageningen 6708WD, The Netherlands; 4Innovation Department, Trouw Nutrition, Puslinch, ON N0B 2J0, Canada

(Received 17 July 2019; Accepted 3 February 2020; First published online 6 March 2020)

Mechanistic models (MMs) have served as causal pathway analysis and ‘decision-support’ tools within animal production systems for decades. Such models quantitatively define how a biological system works based on causal relationships and use that cumulative biological knowledge to generate predictions and recommendations (in practice) and generate/evaluate hypotheses (in research). Their limitations revolve around obtaining sufficiently accurate inputs, user training and accuracy/precision of predictions on-farm. The new wave in digitalization technologies may negate some of these challenges. New data-driven (DD) modelling methods such as machine learning (ML) and deep learning (DL) examine patterns in data to produce accurate predictions (forecasting, classification of animals, etc.). The deluge of sensor data and new self-learning modelling techniques may address some of the limitations of traditional MM approaches – access to input data (e.g. sensors) and on-farm calibration. However, most of these new methods lack transparency in the reasoning behind predictions, in contrast to MMs that have historically been used to translate knowledge into wisdom. The objective of this paper is to propose means to hybridize these two seemingly divergent methodologies to advance the models we use in animal production systems and support movement towards truly knowledge-based precision agriculture. In order to identify potential niches for models in animal production of the future, a cross-species (dairy, swine and poultry) examination of the current state of the art in MM and new DD methodologies (ML, DL analytics) is undertaken. We hypothesize that there are several ways via which synergy may be achieved to advance both our predictive capabilities and system understanding, being: (1) building and utilizing data streams (e.g. intake, rumination behaviour, rumen sensors, activity sensors, environmental sensors, cameras and near IR) to apply MM in real-time and/or with new resolution and capabilities; (2) hybridization of MM and DD approaches where, for example, a ML framework is augmented by MM-generated parameters or predicted outcomes and (3) hybridization of the MM and DD approaches, where biological bounds are placed on parameters within a MM framework, and the DD system parameterizes the MM for individual animals, farms or other such clusters of data. As animal systems modellers, we should expand our toolbox to explore new DD approaches and big data to find opportunities to increase understanding of biological systems, find new patterns in data and move the field towards intelligent, knowledge-based precision agriculture systems.

Keywords: digital agriculture, mechanistic modelling, machine learning, hybridization, animal production

Implications

Causal pathway-based ‘mechanistic models’ have supported decision making and knowledge transmission in animal production for decades, but their role in the era of ‘big data’ and ‘data science’ is unclear. This positional paper proposes that hybridization of modelling approaches represents a niche for animal production where our cumulative biological knowledge expressed in mechanistic models meets the emerging data collection and predictive potential of data science using machine learning and deep learning methods. While both approaches have strengths and unique niches, their true strength may lie in a synergistic relationship.

Animal (2020), 14:S2, pp s223–s237 © The Animal Consortium 2020 doi:10.1017/S1751731120000312

† E-mail: [email protected]

Introduction

The big data wave

The term ‘big data’ has gained considerable attention in recent years, though its definition tends to differ across disciplines (Morota et al., 2018). Common themes in ‘big data’ definitions are (1) volume: that the volume of data is so large that visual inspection and processing on a conventional computer is limited; (2) data types: may include digital images, on-line and off-line video recordings, environment sensor output, animal biosensor output, sound recordings, other unmanned real-time monitoring systems as well as omics data (e.g. genomics, transcriptomics, proteomics, metabolomics and metagenomics) and (3) data velocity: the speed with which data are produced and analysed, typically in real-time. In order to gain insight from these large volumes of readily available data, it has become increasingly popular to apply data mining and machine learning (ML) methodologies to cluster data, make predictions or forecast in real-time. Thus, the topics ‘big data’ and ML, though not explicitly tied together, often work hand-in-hand.

The emergence of big data and its associated analytics is visible in scientific referencing platforms such as Scopus, where yearly ‘big data’ references rose from 680 in 2012 to 16 562 in 2018. When combined with the keywords ‘cattle’, ‘pigs’ or ‘poultry’, the first reference to ‘big data’ appears in 2011, but there are only 172 total references from 2011 to 2018 inclusive, indicating a much slower development rate within animal production systems. Liakos et al. (2018) highlighted that 61% of published agriculture sector papers using ML approaches were from the cropping sector, 19% in livestock production and 10% in each of soil and water science, respectively. There may be several reasons for the slower adoption rate in animal production systems, including the current degree of digitalization, the utility offered, low/unclear value proposition, return-on-investment (ROI) and the challenge of maintaining sensitive technology in corrosive, dusty and dirty environments.

The use of models in animal agriculture

Models of all types (Figure 1) have a strong history of application in animal production, where their objectives have typically revolved around optimally feeding and growing livestock. For an excellent review of the historical evolution of model development and use, for example, in ruminant nutrition, see Tedeschi (2019). Such a development path has largely been paralleled in swine and poultry (Dumas et al., 2008; Sakomura et al., 2015). These predominantly nutritional models have evolved to mathematically express our cumulative biological knowledge on how a system works, developed in order to understand and manipulate nutrient dynamics in the animal. Numerous modelling groups across species, globally, have further developed detailed mechanistic models (MMs) to solve problems both within the scientific community as well as in practice. In the field, they serve as ‘decision support systems’ or ‘opportunity analysis’ platforms. Here, we might define ‘opportunity analysis’ as the ability to examine a variety of scenarios for their potential outcomes, with the goal of improving performance, reducing cost and minimizing environmental impact (e.g. Ferguson, 2015). These nutrition models have been modified, over generations, to account for specific concerns of their eras (e.g. environmental impact) via revisions, expansions and sub-model development, illustrating the sound structural nature of MMs – being based on biological principles allows addition of content, as opposed to requiring complete redevelopment for a new innovation.

Figure 1 (colour online) Types of models and their classification. ML = machine learning, ANN = artificial neural network, SVM = support vector machine, HMM = hidden Markov model, PCA = principal component analysis, ICA = independent component analysis, ISOMAP = non-linear dimensionality reduction method, GLM = general linear model.

The role of mechanistic models in the big data era

Given the push towards fully automated ML systems to interpret the new wave of big data available on-farm and the strong predictive abilities of data-driven (DD) models, the role of MMs is occasionally questioned, as ML-based systems can also be programmed to predict outcomes similar to MMs (e.g. milk yield, feed conversion ratio, etc.). While there are places of overlap, it may be that a distinctive niche remains for both MM and ML modelling approaches, as well as opportunities for both academia and industry to profit from a hybridization or integration of the two approaches. However, this prospect means that the future animal science modeller will require a revised toolbox to cope with advances in technology, statistics and the business of animal production itself.

The objective of this paper is to examine the niche for both conventional MM and new DD modelling approaches as we move into an era of big data and to postulate on opportunities for the two approaches to cooperate and propel the field of animal science forward. This paper will first examine the attributes and niches currently occupied by MM and new DD methods, and then postulate on possible approach integration strategies, which may aid in the goal of producing animal products more efficiently – both economically and environmentally.

Mechanistic modelling methodology

Mechanistic models seek to describe causation (though they always contain empirical components). A MM may be defined as an equation or series of equations, which predict(s) some aspect of animal performance based on its underlying principles (France, 1988). If the animal is considered level ‘i’, then the organ may be considered level ‘i − 1’, the tissue level ‘i − 2’, the cell level ‘i − 3’ and the flock or herd may be considered level ‘i + 1’. Therefore, a MM may seek to describe level i observations with level i − 1 hierarchical representation. The general assumption with MM development is that behaviour of the system can be predicted from the sum of its modelled parts, and that we know the system’s causal pathways. In some instances, we may postulate on the causal pathway via development of a MM and use the modelling/experiment exercise to decide whether we consider this pathway ‘known’ or not.

France (1988) described three important attributes of such a ‘hierarchical’ MM system:

1. Each level has its own language, concepts or principles. For example, the terms ‘forage DM’ and ‘pasture management’ have little meaning at the cell or organelle level;

2. Each level is an integration of items from lower levels. Discoveries or descriptions at a lower level can be used at the higher level in an explanatory way to aid understanding (mechanism and mode-of-action) and

3. Successful operation at the higher level requires the lower level to function properly but not necessarily vice versa. The example France (1988) provides is that of a cup – if it is smashed it will no longer function as a cup, but the molecular properties are hardly altered.

In development, the MM approach requires conceptualization of a hypothesis centred on how a system works, and thus how variables are interconnected. A MM is developed by looking at the structure of the system under investigation, distinguishing its key components and quantitatively analysing the behaviour of the whole system in terms of its individual components and their interactions with one another. The general approach taken in the development of a MM could be summarized as follows:

1. Problem identification, hypothesis generation and definition of model bounds. For example, a digestion kinetic model may choose to focus on macronutrient digestibility and ignore vitamins and minerals or what the animal does with the macronutrients post-absorption;

2. Model conceptualization. Graphical representations such as boxes and arrows are used to depict known or hypothesized causal pathways;

3. Data collection (extant or new) to develop the model, generally piecewise, addressing small biological components of the model (e.g. passage rate prediction within a digestion kinetic model);

4. Model equations and assumptions: Deciding on the type of equation to represent fluxes, assigning equations to the arrows, parameters to equations (science execution);

5. Model evaluation (statistical, graphical, sensitivity, behaviour and scenario) performed by modeller;

6. Repeat steps 1 to 5 for subsequent phases of model development.

Difficulty is often encountered discriminating between essential and nonessential components of a MM, in determining the appropriate equation structure assigned to a relationship and when deciding on the required level of model complexity – thus the need for clearly defined boundaries and objectives from the outset. Often the model will inform the developer when additional complexity or reconsideration of structure is required – that is, when it fails to predict an outcome. In this way, there is often a strong back-and-forth between model development and data-based experimental work – whereby data-based experimental work is used to parameterize a MM, and the MM may indicate in return where knowledge or data are missing from the body of literature in that area.

Such MMs are often coded using programming languages such as Fortran, C/C++/C#, Python, R and Delphi. It is common that MMs are also dynamic (though it is not a requirement), meaning that they simulate over time. This is often accomplished through a series of integrated differential equations (e.g. Dijkstra et al., 1992) but can also be accomplished by a series of ‘do’ loops (e.g. hourly and daily) within a code (e.g. Emmans, 1981).
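As a minimal sketch of what ‘dynamic’ means here, the Python snippet below steps a hypothetical one-pool model through time with a fixed-step Euler loop and compares the result with the closed-form solution of the same equation. The pool, its constant input, the fractional degradation and passage rates (`kd`, `kp`) and all numeric values are illustrative assumptions; they are not taken from Dijkstra et al. (1992) or Emmans (1981).

```python
import math

def simulate_pool(q0, intake, kd, kp, hours, dt=0.01):
    """Euler integration of dQ/dt = intake - (kd + kp) * Q.

    q0     : initial pool size (g)                      -- hypothetical
    intake : constant input rate (g/h)                  -- hypothetical
    kd, kp : fractional degradation and passage (/h)    -- hypothetical
    """
    q = q0
    steps = int(round(hours / dt))
    for _ in range(steps):              # the 'do' loop over time
        dq_dt = intake - (kd + kp) * q  # net flux into the pool
        q += dq_dt * dt
    return q

def analytical_pool(q0, intake, kd, kp, hours):
    """Closed-form solution of the same equation, used as a check."""
    k = kd + kp
    q_ss = intake / k                   # steady-state pool size
    return q_ss + (q0 - q_ss) * math.exp(-k * hours)

q_num = simulate_pool(q0=0.0, intake=100.0, kd=0.06, kp=0.04, hours=24)
q_ana = analytical_pool(q0=0.0, intake=100.0, kd=0.06, kp=0.04, hours=24)
print(round(q_num, 1), round(q_ana, 1))
```

A fixed-step Euler loop is the simplest choice; a production model would more likely rely on an adaptive or higher-order integrator.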


Model evaluation is conducted separately from model development, manually by the developer, and generally includes an assessment of model performance (precision and accuracy), model behaviour and sensitivity analysis, as well as the evaluation of a model’s ability to predict the correct response to ‘scenario’ data. This evaluation is accomplished via the application of a range of evaluation metrics including root mean square prediction error, concordance correlation coefficient, residual plot analysis as well as a host of other performance metrics and visualization strategies (e.g. Tedeschi, 2006). The goal is to determine how well the model performs across a range of existing data and, if applicable, to identify weak points, which require reassessment or modification. The model evaluation process may be manual, performed by the developer (e.g. using the error metrics indicated above), or if the exercise is to fit a MM to existing data, the model may be parameterized via fitting algorithms such as Levenberg–Marquardt, steepest-descent/gradient descent, Newton method, etc. to minimize a residual error metric.
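Two of the metrics named above can be sketched in a few lines of Python. The functions below compute the root mean square prediction error and Lin's concordance correlation coefficient using population (divide-by-n) moments; the observed and predicted values are made-up numbers for illustration only.

```python
import math

def rmspe(observed, predicted):
    """Root mean square prediction error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def ccc(observed, predicted):
    """Concordance correlation coefficient (population moments)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted)) / n
    # Penalizes both scatter (via cov) and bias (via the mean difference)
    return 2 * cov / (var_o + var_p + (mean_o - mean_p) ** 2)

obs = [28.0, 31.5, 35.2, 40.1]   # hypothetical observed milk yields (kg/d)
pred = [27.0, 32.0, 36.0, 39.0]  # hypothetical model predictions (kg/d)
print(rmspe(obs, pred), ccc(obs, pred))
```

A perfect prediction gives an error of 0 and a coefficient of 1; a constant offset between predictions and observations lowers the coefficient even when the correlation itself is perfect.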

Niche for mechanistic models

Opportunity analysis

Mechanistic models aim to represent causality in complex systems. Historically, the most prominent niche occupied by MMs in animal production has been that of solving problems where the intricacies in animal data are too great to solve the problem experimentally. This is where representation of the underlying biology of a system with a MM can assist in determining the impact of a given change (e.g. when concurrent influencers cancel each other out). Mechanistic models have been used to solve problems such as: (1) identification of performance limiting factors; (2) determination of the optimal nutrient contents of a feed; (3) evaluation of management factors to optimize performance (Ferguson, 2015); (4) examination of strategies such as those to reduce nutrient excretion into the environment (Pomar and Remus, 2019) or (5) forecasting outcomes in scenarios not yet seen in practice (Ferguson, 2015). Such MMs are developed to synthesize our biological knowledge into platforms (models) from which ‘wisdom’ and decision-making support can be gained. Within the context of this discussion, the data-information-knowledge-wisdom pyramid of Ackoff (1989) (as illustrated recently in Tedeschi, 2019) is referenced to define knowledge and wisdom. Within this framework, data + context = information, information + meaning = knowledge, and knowledge + insight = wisdom (discussed further below).

Performance optimization

In practice, MMs are often linked to an optimizer – a coding algorithm where, given a desired objective (e.g. minimize cost, maximize growth/milk production, efficiency or ROI), it will determine the optimal solution (e.g. diet formulation) (Ferguson, 2014) to achieve that outcome. Mechanistic models linked to an optimizer may therefore also consider how genetics, environment or management considerations incorporated into the MM play into determining the optimal feeding program, which takes their capabilities well beyond the scope of traditional feed formulation software (e.g. least cost formulation). Optimization programs essentially automate the decision-making process – coincidentally, already a ‘hybridization’ of the MM and DD approach, via the iterative fitting algorithms used in optimization (similar to the way a DD model may find a solution).
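The coupling of a model to an optimizer can be illustrated with a deliberately small Python sketch. Here `predicted_gain` is a hypothetical stand-in for a mechanistic model (a linear-plateau response of daily gain to dietary lysine), `margin` is a hypothetical economic objective, and a plain grid search stands in for the iterative algorithms a real optimizer would use. All names, prices and response parameters are assumptions made for illustration.

```python
def predicted_gain(lysine_g_per_kg):
    """Hypothetical stand-in for a mechanistic model: daily gain (g/d)
    rises linearly with dietary lysine up to a plateau."""
    return min(100.0 * lysine_g_per_kg, 900.0)

def margin(lysine_g_per_kg, gain_value=0.002, lysine_cost=0.15):
    """Hypothetical objective: value of gain minus cost of lysine."""
    return (gain_value * predicted_gain(lysine_g_per_kg)
            - lysine_cost * lysine_g_per_kg)

def optimize(lo=5.0, hi=13.0, step=0.1):
    """Exhaustive grid search over candidate lysine levels."""
    n = int(round((hi - lo) / step))
    candidates = [lo + i * step for i in range(n + 1)]
    return max(candidates, key=margin)

best = optimize()
print(round(best, 1), round(margin(best), 3))
```

With these assumed values the margin increases up to the plateau and declines beyond it, so the search settles at the point where additional lysine no longer buys additional gain.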

Understanding biological systems

Intellectually and academically, MMs provide animal scientists the opportunity to explore how a biological system works, extract meaningful information from data (e.g. metabolic fluxes from isotope enrichments; France et al., 1999) and thus increase our understanding of complex systems and advance the whole field of animal science. They are often used to summarize experimental data to derive meaningful parameters used in other applications, for example, fractional rates of rumen degradation (France et al., 2000) or specific rates of mammary cell proliferation (Dijkstra et al., 1997). In research, MMs are not immune to failure. In fact, they are excellent tools for identifying areas where scientific knowledge is lacking, or where a hypothesis on the regulation of a system may be wrong. Failure of a MM to simulate reality indicates an area where the system has not been appropriately described, and this could be due to a false assumption, a lack of appropriate data or because the level of aggregation at which the model runs is not appropriate for the research question. When models interact iteratively with animal experimentation, MMs assist movement of the whole field forward by increasing our biological knowledge.

Approach limitations: mechanistic models

Current limitations of the conventional MM approach revolve around their manual nature, extensive input requirements and developer/end-user training requirements. From the end-user perspective, these challenges may mean that MMs may not be user-friendly or approachable enough to guarantee use and acceptance. Some have suggested that such problems are rooted in communication and training as opposed to user-friendliness (Cartwright et al., 2016), and protocols to improve the user experience, such as customer journey mapping, are recommended (Vasilieva, 2018).

Tedeschi (2019) reflected that innovation in MM for ruminant nutrition had gone stagnant since 2010 and proposed several reasons why, including (1) the field had reached a certain level of maturity or that (2) students are not being properly introduced to the required ‘systems thinking’ approach. We further suggest that the ‘lag’ observed by Tedeschi (2019) could have an additional origin related to reaching a level of success with nutritional models beyond which we need to integrate with other disciplines – genetics, epigenetics, health, animal management, environment, whole farm modelling and life cycle assessment – before further knowledge (and then wisdom) can be extracted from the generated models.

Data-driven methodologies

In contrast to MMs, DD models structurally rely on correlations within a data set to determine the best combination of input variables that predict the desired outcomes, based on goodness-of-fit as opposed to mode-of-action or mechanism. Hence, their structure is ‘driven by the data’. Perhaps the most widely known and applied DD method in animal nutrition is linear and non-linear regression, including meta-analysis. In the world of data science, such a model represents a ‘supervised’ ML method (Figures 1 and 2, defined below). As modern regression methods are already commonly applied in animal science, they will not be further discussed in this paper, except with respect to discussing where they fall in model classification schemes. Rather, the objective here is to examine ‘new’ frontiers in DD modelling, which might alter or interact with MMs as the field moves forward. In many cases, advanced regression models might interact with new DD modelling methods in the same way. At their core, DD regression models may contain some mechanistic elements, depending on the driving variables and selected structure (Sauvant and Nozière, 2016; Tjørve and Tjørve, 2017). Also, although we often discuss DD v. MMs in black-or-white terms (France, 1988), the lines between mechanistic and DD models are blurry in practice.

A broader categorization of DD models (beyond traditional regression) that fits with the scope of big data and data science broached in this paper is presented in Figures 1 and 2. Data-driven models may be grouped based on the type of learning used therein (supervised and unsupervised), the nature of the data (continuous and discrete) and the category of problems they solve (classification, regression, clustering and dimensionality reduction). Conventional regression modelling (including linear regression, stepwise regression, multivariate regression, etc.), Bayesian models, classification models and most artificial neural network (ANN) models constitute supervised learning models, where the aim is to predict an output variable according to a ‘new’ set of input variables (not previously encountered in training). In supervised learning, as it pertains to ‘big data’, ML systems are presented with inputs and corresponding outputs, and the objective is to construct a general rule, or model, which maps the inputs to outputs. Here, ML may be defined as the ability for a machine to automatically detect patterns in data without being strictly programmed to do so (although its learning algorithm is based on a pre-specified function). By ‘learning’, we refer to the ability for the model to improve predictive performance through an iterative process over time on a defined ‘training data set’. Such supervised DD models are developed via application of statistical fitting procedures to minimize error between predictions and observations (reviewed below). Compared to the MM approach where evaluation of model performance might be manual, ML systems automate this step and iterate towards the best model. The performance of the ML model in a specific task is defined by a performance metric such as minimizing the residual error or mean square error, the same tools applied within MM evaluations. This is called the ‘loss’ function, in data science. The goal is that the ML model will be able to predict, classify or reduce the dimensionality of new data using the experience obtained during the training process.
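The idea of iterating towards the best model by minimizing a loss function can be sketched with the simplest supervised case. The Python snippet below fits a straight line by full-batch gradient descent on the mean square error; the data, learning rate and number of epochs are arbitrary assumptions chosen so that the example converges.

```python
def mse(params, xs, ys):
    """Mean square error ('loss') of the line y = a + b * x."""
    a, b = params
    return sum((a + b * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def fit_line(xs, ys, lr=0.05, epochs=2000):
    """Full-batch gradient descent on the mean square error."""
    a, b = 0.0, 0.0
    n = len(xs)
    history = []                      # loss recorded before each update
    for _ in range(epochs):
        history.append(mse((a, b), xs, ys))
        # Partial derivatives of the loss with respect to a and b
        grad_a = sum(2 * (a + b * x - y) for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (a + b * x - y) * x for x, y in zip(xs, ys)) / n
        a -= lr * grad_a
        b -= lr * grad_b
    return (a, b), history

xs = [0.0, 1.0, 2.0, 3.0, 4.0]        # hypothetical training inputs
ys = [1.0, 3.0, 5.0, 7.0, 9.0]        # generated from y = 1 + 2x
(a, b), history = fit_line(xs, ys)
print(round(a, 3), round(b, 3), history[0], history[-1])
```

The recorded `history` makes the ‘learning’ visible: the loss falls from its starting value towards zero as the parameters approach those that generated the data.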

Conversely, in unsupervised learning, there is no distinction between inputs and outputs, and the goal of the learning is to discover groupings in the data. ‘Clustering’ is a type of unsupervised learning problem aimed to find natural groupings, or clusters, within data. Examples of clustering techniques include k-means (Lloyd, 1982), hierarchical clustering (Johnson, 1967) and the expectation maximization algorithm (Dempster et al., 1977). Dimensionality reduction, another unsupervised learning method, is the process of reducing the variables under consideration to a smaller set of principal variables (components). Principal component analysis (PCA) is a common example of a widely applied dimensionality reduction technique.

Across DD approaches, the steps in development are similar to those of MM development, with the major exception of hypothesis generation and model conceptualization, and might be considered to be:

1. Objective identification (what is to be predicted),

2. Database collection and management (e.g. sensor data, big data and small data),

3. Model encoding (inputs, modelling assumptions, rules, statistical method, etc.),

4. Model training and evaluation (statistical) – often performed iteratively by the system itself, as opposed to manually, as discussed in the previous MM section.

Such DD models are coded in programming languages similar to those used to develop MMs such as FORTRAN, C/C++/C#, Python, R and Delphi.

A significant difference between the MM and DD methodology lies in their interpretability. The MM is a fully ‘white box’ approach – the reasoning behind predictions is fully visible and the logic can be followed. However, most DD methodologies are ‘black box’, meaning a prediction is produced, which may in fact be a very good prediction, but a causal explanation or rationale for the prediction is absent.

Figure 2 (colour online) Data-driven model classifications.

To delve further, this paper will describe in more detail clustering, ANNs and dimensionality reduction, as they are methods commonly applied to interpret big data.

Clustering (unsupervised machine learning)

Clustering algorithms, such as k-means and several variants, seek to learn the optimal division of groups of points from properties of the data (Figure 3). The three basic steps of the procedure are (1) initialization – k initial ‘means’ (centroids) are generated at random, (2) assignment – k clusters are created by associating each observation with the nearest centroid and (3) update – the centroid of each cluster becomes the new mean. Steps (2) and (3) are repeated iteratively until convergence is reached, where the result is that the sum of squared errors is minimized between points and their respective centroids. The ‘cluster centre’ is the arithmetic mean of all the points belonging to the cluster, and each point must be closer to its own cluster centre than to the other cluster centres.
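The three steps above can be sketched in plain Python as follows. This is a minimal illustration rather than a reference implementation: the initial centroids are drawn at random from the observations themselves (one common choice, assumed here), an optional `init` argument allows fixed starting centroids, and the points are made-up values.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, init=None, seed=0, max_iter=100):
    rng = random.Random(seed)
    # (1) initialization: k starting centroids
    centroids = list(init) if init is not None else rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # (2) assignment: each observation joins its nearest centroid
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:      # convergence: assignments are stable
            break
        labels = new_labels
        # (3) update: each centroid becomes the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:               # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(v) / len(members)
                                     for v in zip(*members))
    return centroids, labels

# Hypothetical two-dimensional observations forming two loose groups
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
          (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
centroids, labels = kmeans(points, k=2)
print(centroids, labels)
```

Because the result can depend on the starting centroids, the procedure is commonly repeated from several random initializations and the solution with the lowest sum of squared errors is retained.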

Hierarchical clustering is a type of unsupervised ML algorithm used to cluster unlabelled data points. Similar to k-means clustering, hierarchical clustering groups together data points with similar characteristics. Dendrograms are used to follow the division of data into clusters (Figure 4). In essence, a dendrogram is a summary of the distance matrix between all the items that must be clustered. A dendrogram cannot, by itself, be used to infer the number of clusters. Rather, that number can be established based on a user-defined threshold, which is equivalent to drawing a horizontal line through the dendrogram that will separate the items into a finite number of clusters (Figure 5).
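The equivalence between a user-defined threshold and a horizontal cut through the dendrogram can be sketched as below. This is an assumed, simplified implementation using single-linkage agglomeration (one of several possible linkage rules); the function name and the five-item distance matrix are hypothetical and are not the city data of Figures 3 to 5.

```python
def cut_dendrogram(dist, threshold):
    """Single-linkage agglomerative clustering, stopped at a distance threshold.

    dist is a symmetric matrix (list of lists) of pairwise distances.
    Stopping once the closest pair of clusters is farther apart than the
    threshold is equivalent to cutting the dendrogram at that height.
    """
    clusters = [[i] for i in range(len(dist))]  # every item starts alone
    while len(clusters) > 1:
        # Find the two clusters with the smallest single-linkage distance.
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        if d > threshold:  # the horizontal cut line has been reached
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Hypothetical distances among five items.
dist = [
    [0, 2, 3, 20, 21],
    [2, 0, 4, 19, 22],
    [3, 4, 0, 18, 20],
    [20, 19, 18, 0, 1],
    [21, 22, 20, 1, 0],
]
print(cut_dendrogram(dist, threshold=5))    # [[0, 1, 2], [3, 4]]
print(cut_dendrogram(dist, threshold=1.5))  # [[0], [1], [2], [3, 4]]
```

Raising or lowering the threshold changes the number of clusters returned, which is the sense in which the number of clusters is a user decision rather than a property read directly from the dendrogram.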

Artificial neural networks (mostly supervised machine learning)

The ANN is an example of a supervised ML model. The ANN is designed to computationally mimic the perceived structure and function of neurons in the brain. These models attempt to emulate complex functions such as pattern recognition, cognition, learning and decision making (McCulloch and Pitts, 1943). The brain does so via billions of neurons that inter-connect, process and interpret information. The structure of an ANN is therefore like a simplified version of the biological neural network, whereby a number of ‘nodes’ are arranged in multiple interconnected layers. Each node is able to integrate the provided input via a weighted sum upon which an activation function (typically logistic) is applied. Exceeding the threshold of this activation function provides an output, which emulates the firing of a neuron in the brain.

An ANN must contain at least three components: (1) the input layer, which inputs data into the system; (2) one or more hidden layers, where the learning takes place and (3) an output layer, where the decision or prediction is performed (Figure 6). Most types of ANNs take several inputs, process them through ‘neurons’ from a single or multiple hidden layers and return the result using an output layer. This result estimation process is known as ‘forward propagation’. The ANN then compares the result with an actual output, assesses the level of error and which neuron weights contributed most to the total error and then adjusts the weights and biases to minimize that error. This step is known as ‘backwards propagation’. The weight and bias adjustment uses the same algorithms as used for parameterization of MMs (e.g. Levenberg–Marquardt, full batch gradient descent, stochastic gradient descent and scaled conjugate gradient algorithm).
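To make the forward and backwards propagation steps concrete, the sketch below trains a single logistic node (i.e. the perceptron-like unit of Figure 6, with no hidden layer) using stochastic gradient descent. It is a deliberately reduced illustration under stated assumptions: the training set (the logical OR of two inputs), the learning rate, the number of epochs and all function names are hypothetical, and a practical ANN would contain hidden layers and many more nodes.

```python
import math
import random

def sigmoid(z):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, x):
    """Forward propagation: weighted sum of inputs, then activation."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def train_neuron(samples, epochs=2000, lr=0.5, seed=0):
    """Train one logistic node with stochastic gradient descent."""
    rng = random.Random(seed)
    n_inputs = len(samples[0][0])
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
    bias = 0.0
    for _ in range(epochs):
        for x, target in samples:
            out = predict(weights, bias, x)
            # Backwards propagation: for a logistic node with cross-entropy
            # loss, the error gradient at the weighted sum is (out - target).
            delta = out - target
            weights = [w - lr * delta * xi for w, xi in zip(weights, x)]
            bias -= lr * delta
    return weights, bias

# Hypothetical training set: the logical OR of two binary inputs.
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias = train_neuron(data)
```

After training, `predict(weights, bias, (0, 0))` falls below 0.5 and the other three input patterns fall above it; the same error-driven weight update, applied layer by layer, is what adjusts the hidden layers of a larger network.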

Figure 3 (colour online) Two distinct clusters of nine Canadian cities using the driving distance metric.

Figure 4 (colour online) Dendrogram representing the clustering of nine Canadian cities based on the driving distances among them.


One of the most popular types of ANN is the multi-layer perceptron (MLP) (Rosenblatt, 1958; Aitkin and Foxall, 2003). An MLP contains multiple ‘hidden layer’ nodes between the input and the output layers of the ANN (Figure 7). In this type of model, all layers are fully connected – every node in a layer is connected to every node in the previous and following layer (the input layer having no previous layer and the output layer no following layer). Models with multiple hidden processing layers are referred to as deep ANNs or ‘deep learning’ (DL) models. These models can, like any ANN, be either supervised, partially supervised or unsupervised (e.g. where the ‘feature’ or node extraction is performed by the model itself).

Dimensionality reduction (unsupervised machine learning)

With the advent of advanced computing techniques, high-throughput data collection hardware (sensors) and increasingly faster hardware platforms, large and complex data sets are generated that may include thousands of ‘features’ or variables. The increase in the dimensionality of the data requires an exponential increase in the number of observations if all possible combinations of feature values are to be included. This is almost impossible to achieve in practice, and the number of data features (p) is generally significantly larger than the number of observations (n). A typical example of the problem appears in genome-wide association studies where the number of variables is in the tens of thousands (e.g. 50 K genotyping arrays), while the number of observations is at least one order of magnitude lower. Two related but distinct approaches are applied when such situations occur in practice: feature selection and feature engineering. Feature selection focuses on the selection of relevant features to be used in modelling the data. The relevance is established using statistical (e.g. select high variance variables) or other data-focused techniques and criteria. Feature engineering focuses on the generation of new features from existing features, by applying various linear (e.g. PCA, linear discriminant analysis and factor analysis) or non-linear (e.g. multi-dimensional scaling, isometric feature mapping and spectral embedding) transformations and operations on them.

Figure 5 (colour online) Equivalence between clusters and dendrogram interpretation.

Figure 6 Single layer artificial neural network (ANN) (perceptron), where X1, X2 and X3 represent inputs and W1, W2 and W3 represent weights.


As a common example, in a PCA, a set of observations of ‘possibly’ correlated variables is orthogonally transformed into a set of linearly uncorrelated variables called ‘principal components’. The first principal component accounts for the largest proportion of the variance in the data, and each succeeding component in turn has the highest variance possible with the constraint that it is orthogonal to the preceding components. The resulting vectors are each a linear combination of the original variables and are mutually uncorrelated. Similar to an ANN, the objective of these models is to account for as much variation as possible, but interpretation of the ‘meaning’ of the underlying structure is difficult.
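As a hedged illustration of what the first principal component represents, the sketch below extracts it from a small data set by power iteration on the sample covariance matrix. Power iteration is used here only because it needs no external library; it is one of several ways to obtain the leading eigenvector and is not claimed to be the method used in the works cited. The function name and the five observations are assumptions for illustration.

```python
import math

def first_principal_component(rows, n_iter=200):
    """Return (direction, variance, share of total variance) of the first PC."""
    n, p = len(rows), len(rows[0])
    # Centre each variable on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p).
    cov = [[sum(r[a] * r[b] for r in centred) / (n - 1) for b in range(p)]
           for a in range(p)]
    # Power iteration converges to the eigenvector with the largest eigenvalue
    # (assuming the starting vector is not orthogonal to it).
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(n_iter):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    variance = sum(v[a] * sum(cov[a][b] * v[b] for b in range(p))
                   for a in range(p))
    total = sum(cov[a][a] for a in range(p))
    return v, variance, variance / total

# Hypothetical data: two strongly correlated variables (second is about twice the first).
rows = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]
direction, variance, explained = first_principal_component(rows)
```

For these data the first component points roughly along the direction (1, 2) and carries nearly all of the total variance, so the two original variables could be replaced by a single derived feature with little loss of information.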

Niche for data-driven models

Machine learning methods have been shown to help solve multidimensional problems with complex structures in the pharmaceutical industry and medicine, as well as in other fields (LeCun et al., 2015). In this respect, they represent a very powerful data synthesis technique. Similarly, Liakos et al. (2018) found in a review that within the agriculture sector, papers using ML approaches largely focused on disease detection and crop yield prediction. The authors reflected that the high uptake in the cropping sector likely reflects the data-intensive nature of crop production and the extensive use of imaging (spectral, hyperspectral, near-IR, etc.). Based on a review of the available literature, currently the application ‘niche’ occupied by ML/big data models in animal production revolves around problems that fall into the following categories:

(i) Classification (supervised) or clustering (unsupervised) problems, where the model will group data based on common characteristics (features),

(ii) Prediction problems, where the models ‘learn’ a function (not necessarily mathematical) that best describes (fits) the data or

(iii) Dimensionality reduction problems, where the model will select a subset of features that best represent the data.

However, the categories mentioned above are really ‘methodologies’ rather than niches and might therefore be further generalized within the animal production setting as ‘pattern recognition’ (encompassing classification and clustering) and ‘predictive ability’.

Pattern recognition

Broadly, DD methodologies demonstrate strength in interpreting various types of novel data streams (e.g. audio, video and image) to cluster, classify or predict based on supervised or unsupervised approaches and mapping of patterns within the data. Within animal production, this has most notably been applied to animal monitoring and disease detection.

For example, a series of sensor types, with ML models behind them to interpret and classify the data, have been developed for use in practice to monitor changes in animal behaviour (which may signify a change in health status, injury or heat) or are used for animal identification. Numerous publications have shown the ability of sensors to classify animal behaviour (grazing, ruminating, resting, walking, etc.), for example, via 3-axis accelerometers and magnetometers (Dutta et al., 2015), optical sensors (Pegorini et al., 2015) or depth video cameras (Matthews et al., 2017), along with ML models to classify the collected data. As continuous human observation of livestock to the extent that a subtle change in behaviour could be observed and early intervention applied is often impractical, the niche for automated monitoring systems to track animal movement and behaviour has formed.

As well, several other examples of how big data and ML have been applied to the task of early disease detection can be found in the literature. Several researchers have developed ANN models, which analyse poultry vocalizations in order to detect changes and identify suspected disease status earlier than conventionally possible. For example, Sadeghi et al. (2015) recorded broiler vocalizations in healthy and Clostridium perfringens infected birds. The authors identified five features (clusters of data) using an ANN model, which showed strong separation between healthy and infected birds and were able to differentiate between healthy and infected birds with an accuracy of 66.6% on day 2 and 100% on day 8 after infection. Similar to vocalization, infection may lead to detectable differences in movement patterns (Colles et al., 2016; optical flow analysis) and the surface temperature of animals (Jaddoa et al., 2019; IR thermal imaging), leading to earlier diagnosis of disease outbreak.

Figure 7 (colour online) Multi-layer artificial neural network (ANN).

Predictive abilities

Data-driven methodologies have also found a niche in forecasting and predicting, for example, numerical outcomes due to their strong fitting abilities and ability to map even minute levels of variation (e.g. within an ANN). As such, within animal production systems they are well situated to forecast performance metrics of economic importance such as BW, egg production or milk yield. Alonso et al. (2015) used a support vector machine classification model to forecast the BW of individual cattle, provided the past evolution of the herd BW is known. This approach outperformed individual regressions created for each animal, in particular when there were only a few BW measures available and when accurate predictions more than 100 days away were required. Pomar and Remus (2019) and Parsons et al. (2007) as well as White et al. (2004) proposed the use of a visual image analysis system to monitor BW in growing pigs from which they could determine appropriate feed allocations.

Approach limitations: machine learning

Over-fitting

A significant limitation of ML/big data model training is the tendency to over-train, where the ML model learns the ‘noise’ within the training data, which leads to poor generalization capacity (the ability to make predictions on ‘new’ data sets) (Basheer and Hajmeer, 2000). To overcome this problem, the ‘gold standard’ is that the development process should include (1) a training data set, (2) an internal cross-validation data set and (3) a testing (external) data set. Within an ANN, the number of hidden layers is critical for adequate prediction but must be limited to deter the ANN from learning such background ‘noise’ – however, the number of hidden layers is typically chosen by trial and error by the developer.
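The three data sets named above might be set up as in the following sketch. The function name, the 60/20/20 proportions and the fixed seed are assumptions for illustration. Note also that a random partition of one data set is only a stand-in for a truly external testing set, which would ideally come from an independent source (e.g. another farm or study).

```python
import random

def three_way_split(records, frac_train=0.6, frac_val=0.2, seed=0):
    """Shuffle records and split into training, validation and testing sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed for reproducibility
    n_train = round(len(shuffled) * frac_train)
    n_val = round(len(shuffled) * frac_val)
    train = shuffled[:n_train]                 # used to fit the model
    val = shuffled[n_train:n_train + n_val]    # used to tune, e.g. hidden layers
    test = shuffled[n_train + n_val:]          # held back until the very end
    return train, val, test

# Hypothetical data set of 100 record identifiers.
train, val, test = three_way_split(range(100))
```

Model complexity (e.g. the number of hidden layers) would then be chosen on the validation set, with the testing set consulted once only, so that the reported error is not itself a product of tuning.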

Data volume requirements

As with all models, with DD models one must be cautious of biases introduced via skewed training data (e.g. to favour other characteristics, such as chicken age or background noise, Astill et al., 2018). Data-driven models are data-hungry, requiring large data sets for training and evaluation purposes to rule out biases, noise and data imbalance. Ideally, these large data sets would also have a large variability, covering as many foreseeable scenarios as possible (Kamilaris and Prenafeta-Boldu, 2018).

Lack of transparency

Perhaps the most serious criticism of most ML models as they are applied to big data is the lack of transparency in the rationale behind each prediction. Although the means to obtain the prediction are known, via the algorithm used, the features responsible for causing the prediction, even from a mathematical point of view, are not always discernible (Knight, 2017). In this respect, the model represents a ‘black box’. To deal with this interpretability problem, several methods have been developed to show how ANNs build up their understanding (see ‘explainable AI’, e.g. Samek et al., 2017). Examples of these include ‘feature visualization’ (e.g. see Olah, 2017), ‘style transfer’ (e.g. Gatys et al., 2016) and inserting ‘attention mechanisms’ into the ANN to track what it focuses on (e.g. Ilse et al., 2018). To gain knowledge from the information generated (according to the Ackoff (1989) pyramid), it is likely that these visualization methods will need to be widely adopted.

Potential overlap and synergies between approaches

Tedeschi (2019) proposed that modelling may be entering a second era of growth, like the one experienced in the 1950s, this time driven by the fourth industrial revolution. The slow but apparent move towards a ‘smart industry’ in which the Internet of Things and robotics track and trace everything that is going on in a farm provides a second ‘boost’ to the application of mathematical modelling of all types in agriculture.

Table 1 summarizes the potential niche areas (by topic, according to the niches identified above) and relative (application) strengths of the two approaches. Some of this niche differentiation is based on the data type – for example, within the realm of image, video and sound interpretation, DD/ML models can turn data into information via pattern recognition, whereas MMs cannot. At the other extreme, MMs are much more equipped to simulate what is ‘likely’ to happen in a situation not previously seen in practice. (A DD model could not do so – predictions are based on what it has been previously trained on.) However, the strength of this niche differentiation is further dependent on the digital maturity of the environment, as not all production systems globally will automate their data, and animal production might remain rather manual in some markets. This distinction creates another niche division between the approaches where both remain and have value.

Although the spectrum of digitalization is continuous, it could also be discussed within a categorical hierarchy of increasing digitalization:

1. No digitalization: Simple empirical models may prevail, which can address major issues with easily manipulated equations and minimal input data;

2. Manual data pipeline: A MM with a custom developed front-end may be most suitable;

3. A small digital pipeline with a limited number (1 or 2) of data streams: May enable automatic input population for application of MMs (e.g. see the ‘precision agriculture’ section below);


4. A medium digital pipeline: Would mean combining different data sources with management systems delivering real-time data. Some simulation functions of MM may be replaced by real-time variation. The need for heavy-duty front-end development may be reduced;

5. A fully digitized pipeline: Would enable MM and DD models to run a farm, monitoring the status and automatically implementing generated recommendations (e.g. Tesla’s AutoPilot and vertical farming).

Within such a categorical analysis of the spectrum, a complete lack of digitalization (1) might suit the use of a simple model by, for example, a third party consultant, based on rough information provided (from memory or estimation) by the farmer. This scenario is often present in under-developed countries. The development of a manual data pipeline (2) (e.g. measuring and recording of data) allows the development of improved ‘benchmarking’ abilities – and therefore more reliable model predictions from such a simple empirical model or MM. With a small continuous digital pipeline (3), continuous optimization and real-time benchmarking (as is being achieved now in precision nutrition for swine, e.g. Pomar et al., 2015) become possible. It is at this point on the scale of digitalization that things become interesting – for example, if BW is measured automatically, do we forecast BW from a DD model, or do we use the measured BW data as an input to our MM to optimize – for example – ration composition? From this point forward, there is also a niche for hybridization of the DD and MM approaches, which is further detailed below. A medium-sized digital pipeline (4) would allow real-time optimization of integral parts of the MM engine and combine it with digital data. For example, a ration MM could be optimized using real-time or DD-forecasted intake and BW data while, on the side, an ANN is applied to audio/video data to determine, for example, health, activity level or behaviour. A full digital pipeline (5) would allow monitoring and optimization of the entire system. This would allow a MM to be utilized as the basis and be optimized/augmented with DD learning of patterns never envisioned – straight up the ladder of causation (e.g. see Pearl & Mackenzie, 2018). Similar to vertical farming, it would mean the automated management of a farm. To our knowledge, (5) does not yet exist within animal production systems.

The most likely points for overlap between DD and MM approaches are therefore (1) within a digitalized market and (2) with those models which deal with predicting animal performance and related intermediaries. This is because within this realm, both DD and MM have demonstrated success (‘response prediction’ (MM) and ‘prediction abilities’ (DD)). It might be important to consider again that while we distinguish MM and DD approaches, they already have considerable technical overlap, though they use different terminology. For example, the ‘loss function’ is a residual error term, minimized with both supervised DD fitting methods (e.g. within an ANN) and the parameterization and optimization of MMs. ‘Supervised ML’ models include commonly applied methodologies such as regression techniques, and ‘unsupervised ML’ models include the PCA method, applied for decades in animal science. The real difference, as intended to be highlighted in this review, is their application in conjunction with big data and modern computing power. The most appropriate approach may therefore depend largely on the market and the objective of developing and deploying the model. If the objective is forecasting performance for management decision purposes (e.g. what date will a flock of broilers reach market weight, based on their current growth?), then DD approaches may be most appropriate, with their powerful fitting and forecasting abilities (given enough historical data). If the objective is to manipulate the system (e.g. nutritional formulation), problem solve and troubleshoot, or ‘increase knowledge’ (academic, ‘why’ and ‘how’ do things work as they do?) then likely the cumulative biological knowledge present in MM, perhaps hybridized with DD, will yield the most information to the user. While some argue that DD/‘big data’ can, and will, occupy this niche of ‘opportunity analysis’ and ‘decision support’ occupied by MM, the data science field has yet to achieve success in this area. It will likely be considerable time until (1) the required level of digitalization is achieved in animal production systems and (2) a DD model can use those data to inform causality (Pearl & Mackenzie, 2018).

Table 1 Identified areas of strength for mechanistic modelling and data-driven models, as well as potential overlap

Column headings: Area; Mechanistic modelling; Data-driven modelling

Nutrition: Balancing and optimizing use and delivery of
    Amino acids  X
    Energy  X
    Vitamins and minerals  X
    Non-nutritional qualities of feed  X
    Nutrition × Management × Environment interactions  X
    Feed additives  X
    Nutrition × Health interactions  X
Environment
    Temperature  X  X
    Humidity  X  X
    Ventilation  X  X
    Bedding  X  X
Sustainability
    Nitrogen  X  X
    Phosphorous  X  X
    Greenhouse gases  X  X
Variance/uniformity
    Raw materials  X
    Manufacturing  X
    Animal & carcass attributes  X
Benchmarking
    Monitoring performance over time  X
    Forecasting in real-time  X
Health
    Disease outbreak status  X
    Mortality/condemnations  X
Animal management
    Monitoring animal health  X
    Epigenetics (e.g. early life nutrition)  X  X

Whereas each core approach (MM and DD) has its fundamental limitations, a potential hybridization (as suggested around step (3) in the degrees of digitalization of a market) may enable a solution that has more to offer than the sum of its parts. In fact, such hybridizations are already being developed, largely within the realm of ‘precision agriculture’ (discussed further in the sections below). By integrating the causal pathways of MMs with the more advanced and more widely deployable learning algorithms of DD models, hybrids could model the entirety of variables that play a role in both the animal and the farm, increasing both knowledge and wisdom generation. The following section will go further to propose additional routes by which these two fields may be joined along the pathway to full digitalization in the near future.

Data-driven models may serve as inputs to mechanistic models

A mainstream criticism of MMs has always revolved around the type, amount and difficulty of obtaining accurate inputs and outputs. With the big data wave and increased prevalence of sensor data, this will become more manageable. Extensive feed characterization is increasingly available via rapid technologies (e.g. Foskolos et al., 2015), in-pen/walk-over weigh scales allow frequent weight data collection (e.g. Dickinson et al., 2013), wearable accelerometers allow monitoring of feeding activity and animal behaviour (Borchers et al., 2016), environmental sensors allow real-time collection of environmental data (temperature, humidity and air quality) and several biosensors even allow monitoring of internal conditions (Neethirajan et al., 2017). A notoriously difficult area of prediction, particularly in ruminants, remains dry matter intake (DMI) (e.g. Halachmi et al., 2004). At the simplest level, the ability to get reasonable estimates or actual values of DMI in non-controlled conditions has incredible value to the development, use and application of MMs. To the modeller aiming to understand biology (and thus how it can be harnessed for a desired outcome), the big data wave represents a new era in data resolution.

The requirements for interaction between MM and true ‘big data’ will be that (1) MMs will need to be deployed in cloud-based data pipelines, (2) the time-step of the data will have to be aligned with the time-step of the MM (or vice versa) – either the data are significantly simplified or the time-step considered within the MM is reduced to model changes within a day, (3) the reliability and error around sensor data pre-interpreted by ML need to be known (e.g. body fat estimation from cameras) and (4) MMs may have to prepare (and innovate) for different data types (e.g. behaviour data).
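Requirement (2) above, aligning the time-step of the data with that of the MM, can be illustrated with a simple aggregation. The sketch below assumes a MM that runs on a daily time-step and a hypothetical feeder sensor that logs individual intake events; the function name, the event format and the values are all assumptions for illustration.

```python
from collections import defaultdict

def daily_totals(readings):
    """Sum (hours_since_start, value) sensor readings into whole-day totals."""
    totals = defaultdict(float)
    for hours, value in readings:
        totals[int(hours // 24)] += value  # day index 0, 1, 2, ...
    return dict(sorted(totals.items()))

# Hypothetical feed-intake events: (hours since start, kg consumed).
events = [(2.0, 0.5), (9.5, 0.75), (20.0, 0.25), (26.0, 0.5), (40.0, 1.0)]
print(daily_totals(events))  # {0: 1.5, 1: 1.5}
```

The alternative route named in the text, shortening the MM time-step so that within-day variation is modelled directly, would keep the individual events rather than summing them.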

This interaction between big data and MM has, in fact, already been accomplished and demonstrated most notably in swine, where precision feeding systems utilize sensor data and iterate with a MM to determine the optimal blend of two contrasting feeds for individual animals. For example, Pomar et al. (2015) and Parsons et al. (2007) describe a precision feeding system for swine, whereby a model with both mechanistic and empirical components (where empirical may be classified as a DD approach such as regression) is provided with real-time BW and feed intake data from the barn. Within this framework, an empirical model uses up-to-date data for each pig to estimate daily starting estimates for BW, feed intake and daily gain. These forecasted values are then entered into the MM to estimate standardized ileal digestible lysine and other amino acid requirements, as well as the optimal concentration of these nutrients in the feed for the day. Based on the model-forecasted nutrient requirements, animals are individually fed a blend of two feeds targeted to their optimized requirement.
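The final blending step can be illustrated with simple mass-balance arithmetic. The sketch below is not the algorithm of the systems cited above; it only shows, under assumed values, how a target nutrient concentration could be translated into the proportions of two feeds. The function name and the lysine concentrations are hypothetical.

```python
def blend_fraction(target, conc_a, conc_b):
    """Fraction of feed A in a two-feed blend that meets a target concentration.

    Solves target = f * conc_a + (1 - f) * conc_b for f, then clips f to the
    range [0, 1] because a blend cannot contain a negative share of a feed.
    """
    if conc_a == conc_b:
        raise ValueError("the two feeds must differ in concentration")
    f = (target - conc_b) / (conc_a - conc_b)
    return min(1.0, max(0.0, f))

# Hypothetical values (g standardized ileal digestible lysine per kg feed).
share_a = blend_fraction(target=9.0, conc_a=11.0, conc_b=6.0)
print(round(share_a, 2))  # 0.6
```

In this example a pig whose model-forecasted requirement corresponds to 9 g/kg would receive 60% of the richer feed and 40% of the leaner feed; a target outside the range spanned by the two feeds is clipped to 100% of the nearer feed.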

Precision agriculture aims to deliver nutrients proportional to each individual’s need in order to improve nutrient utilization, efficiency of production and uniformity and to reduce the impact of farming on the environment (Bongiovanni and Lowenberg-Deboer, 2004; Pomar and Remus, 2019), and thus has benefited hugely from the introduction of real-time monitoring and sensor technologies. The general requirements as outlined by Pomar et al. (2015) are (1) the ability to precisely and rapidly evaluate the nutritional potential of feed ingredients; (2) real-time determination of individual animal or group nutrient requirements (via a MM); (3) the ability to formulate balanced diets which limit the amount of excess nutrients and (4) the concomitant adjustment of dietary nutrient supply to match the requirements of individual animals within the group. In order to feed animals individually, individual monitoring of at least BW (in-pen weigh scales or visual image analysis) or milk production (depending on species) and daily feed intake (via animal identification tags and individual feeders) is required and there must be equipment in place to allow targeted nutrient delivery to individual animals.

What may not have been attempted yet is the integration of other ML-interpreted big data, such as health and reproductive status, activity (energy budget) and environment, which may account further for between-animal or between-farm differences in performance.


Mechanistic models may serve as inputs to data-driven models

Another way that the fields could hybridize might include using the outcomes from a MM as a potential input ‘node’ value for a DD model, for example, an ANN. In this way, some ‘biological common sense’ might be supplied to the ANN, potentially useful for decision-making support. Considerations would revolve around careful specification of the range of additional inputs considered, to avoid over-parameterization, collinearity and variables serving as inputs to the MM and also as inputs to the ANN directly. Particularly for topics problematic to MM prediction, this might represent an innovative approach to explore.

Another example might be in modelling supply chain dynamics. This is an example of an area where ML models may be successfully applied at the highest level to assist decision making within animal production systems (Stefanovic & Milosevic, 2017). Such supply chain models may be valuable for (in particular) integrators, with complex logistical considerations and fluctuating market demands, whereby a larger ML model framework may forecast market changes and the resulting financial impact, and suggest corrective action ‘now’ to mitigate negative impacts in the future. An embedded or ‘consulted’ MM may be able to suggest farm-level changes to improve outcomes, reduce variability or reduce loss. In this way, for the integrator of the future, MMs may provide ‘optimization’ capabilities within large decision management platforms (e.g. when to slaughter, which farms to invest in for improvements, diet modifications to hit targets/mitigate losses, etc.).

Mechanistic models used to interpret big ‘omics’ data

There is a plethora of ‘omics’ data currently being generated but with minimal biological interpretation of their meaning (data translated into information) (Crow et al., 2019; Misra et al., 2018). Crow et al. (2019) examined the predictability of differential gene expression and found that the same sets of genes keep coming up as differentially expressed in experiments, regardless of the treatment applied. In trials that examine the rumen microbiome, we have little knowledge of whether these microbial differences have actual implications for the animal (efficiency, CH4 and milk FA profile). In these areas, omics data analysis may benefit from teaming with MMs, to move towards interpretation of the generated big data. Such MMs might also be able to inform omics pathway analysis via, for example, metabolic control analysis (sensitivity analysis on a real system, as opposed to a simulated one), where control points in a pathway could be identified from differential gene expression data. Academically, MMs may serve as a bridge between quantitative and qualitative analysis, such as this.

Full integration of mechanistic model code with machine learning fitting

As we move towards ‘precision agriculture’ across species, one must ask whether that includes the ability to manipulate and optimize the system or simply to have accurate real-time information and the ability to forecast. If the former, there will be the need for biological models, with the cumulative knowledge about how a system works, to be fully integrated with the ML models’ ability to make accurate predictions on individual animals. One possible way to achieve this may be individual parameterization of MMs, via integrating backwards propagation techniques applied within ANNs (for example) into a MM, while placing biological confidence limits around MM parameters permitted to vary between animals to prevent extending beyond biologically sensible values. Knowledge of the biological uncertainty, and setting bounds to limit purely empirical fitting of the model, may be critical to keep the integrity of the MM. Pomar and Remus (2019) propose a real-time closed feedback system to determine nutrient requirements and subsequent feed intakes for individual pigs by combining both MM and ML.
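A minimal sketch of such bounded, per-animal parameterization is given below. It is an assumption-laden toy rather than a published MM: the growth curve, its fixed initial and mature BW, the rate bounds and the function names are all hypothetical. A Gauss–Newton step (the undamped core of the Levenberg–Marquardt method) stands in for the gradient-based fitting used in ANNs, and the rate parameter is clipped to its bounds after every step so that it cannot leave the biologically sensible range.

```python
import math

W0, MATURE = 30.0, 300.0  # hypothetical initial and mature BW (kg)

def model_bw(t, k):
    """Toy growth curve: BW approaches the mature weight at rate k (per day)."""
    return MATURE - (MATURE - W0) * math.exp(-k * t)

def fit_rate(days, observed, k_init=0.01, bounds=(0.002, 0.02), n_iter=50):
    """Fit k for one animal by Gauss-Newton steps, clipped to biological bounds."""
    lo, hi = bounds
    k = k_init
    for _ in range(n_iter):
        num = den = 0.0
        for t, obs in zip(days, observed):
            resid = model_bw(t, k) - obs                  # model minus observation
            sens = (MATURE - W0) * t * math.exp(-k * t)   # d(BW)/dk
            num += resid * sens
            den += sens * sens
        if den == 0.0:
            break
        k -= num / den               # step that reduces the squared-error loss
        k = min(hi, max(lo, k))      # keep k within biologically sensible limits
    return k

days = [10, 30, 50, 70, 90]
# Hypothetical animal whose true rate lies inside the bounds.
k_within = fit_rate(days, [model_bw(t, 0.012) for t in days])
# Hypothetical animal whose data imply a rate beyond the upper bound.
k_clipped = fit_rate(days, [model_bw(t, 0.05) for t in days])
```

In the first case the fitted rate recovers the value used to generate the data; in the second the fit stops at the upper bound, flagging that the data cannot be explained by the MM within its biological limits rather than silently absorbing the discrepancy into an implausible parameter value.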

The true challenge to widespread adoption of precision feeding in animal production is likely financial and logistical – requiring a substantial investment in facility re-design and technology upgrades. Not until the ROI can be demonstrated, logistical barriers are overcome and trust in the system is developed (there is much at stake within animal production if things go wrong) (Cartwright, 2016), is precision feeding likely to be taken up by the industry as a solution for the future.

A diagram of how this integration might generate value is presented in Figure 8. Figure 8 summarizes the flow from a problem statement to data to information to knowledge to wisdom (as illustrated in the data-information-knowledge-wisdom pyramid of Ackoff, 1989) and how MM and ML may assist at different steps of that flow. In general, the ML/big data methods may independently only get us so far as the translation of data into information (perhaps up until knowledge – e.g. disease detection). The MM methods translate knowledge into wisdom but may lack sufficient information. Their integration would seemingly benefit both realms on the path to explainable AI.

Summary

This paper illustrates that a niche exists for both MM and ML modelling approaches within animal production, based on (1) the way in which we use models and (2) a varying degree of global digitalization, but also that substantial opportunity exists to expand the utilization and interpretation of data by hybridizing these approaches. Hybridization has already been initiated within the precision feeding sector but may be expanded to address other issues including those reviewed in this paper, thereby moving the data science field closer to ‘explainable AI’. Elshawi et al. (2019) comment that current ML models are not yet providing the end-user with any ‘smartness’ in the decision-making process. Ching et al. (2018) observe that understanding how users should interpret models to make testable hypotheses about the ML model remains an ‘open challenge’. Therefore, we see a distinct niche for the hybridization of approaches (predictive strength meets causality) in the near future.

Second, what might also be evident now is that at their core, MM and supervised ML models use many of the same methodologies (e.g. model parameterization and optimization algorithms), though they may use them differently and apply different terminology. Many DD methodologies are also decades old, but modern computing power and new data streams have revolutionized their use.

Lastly, the expansion to utilize new methodologies and hybridization of approaches between MM and DD will require that the next generation of animal science modellers is provided with a revised toolbox, such that they can utilize the emerging suite of DD modelling methodologies available to them. To address this, universities must (1) adapt course content to include training animal scientists in new and advanced DD methodologies (Xu and Rhee, 2014), (2) collaborate between departments to bridge the gap between data science and animal science and advocate for multidisciplinary teams or (3) accept passage of the baton to data scientists. The last option may limit the current level of integration of modelling with experimental research and the resulting advancement of knowledge we obtain via this collaboration.

Acknowledgements

This study was enabled in part by support from the Canada First Research Excellence Fund to the ‘Food from Thought’ research program at the University of Guelph. We would also like to thank the reviewers for their support and suggestions to improve this manuscript. Primary results from this paper were previously published in abstract form (Ellis et al., 2019).

J. L. Ellis 0000-0003-0641-9622
J. Dijkstra 0000-0003-3728-6885

Declaration of interest

The authors declare that there are no conflicts of interest.

Ethics statement

Not applicable.

Software and data repository resources

No data or models were deposited in an official repository.

Figure 8 (colour online) Full integration of mechanistic and data-driven models in the translation of data → information → knowledge → wisdom (where MM = mechanistic modelling, ML = machine learning modelling, dotted lines are feedback loops). Research may generate both data (1) and knowledge (3). ‘Big data’ interpreted by ML may generate information (2), which in itself may be used for decision making but relies on correlation (thereby lacking knowledge or wisdom). Within this framework, MMs may be seen as where biological knowledge is accumulated (3), and these may on their own be used to derive wisdom (4), though they often lack sufficient driving information. Hybridization of ML and MM (*) may close the loop between prediction accuracy/precision and causality.

References

Ackoff R 1989. From data to wisdom. Journal of Applied Systems Analysis 16, 3–9.

Aitkin M and Foxall R 2003. Statistical modelling of artificial neural networks using the multi-layer perceptron. Statistics and Computing 13, 227–239.

Alonso J, Villa A and Bahamonde A 2015. Improved estimation of bovine weight trajectories using Support Vector Machine Classification. Computers and Electronics in Agriculture 110, 36–41.

Astill J, Dara RA, Fraser EDG and Sharif S 2018. Detecting and predicting emerging disease in poultry with the implementation of new technologies and Big Data: a focus on Avian Influenza virus. Frontiers in Veterinary Science 5, 1–12.

Basheer IA and Hajmeer M 2000. Artificial neural networks: fundamentals, computing, design, and application. Journal of Microbiological Methods 43, 3–31.

Bongiovanni R and Lowenberg-Deboer J 2004. Precision agriculture and sustainability. Precision Agriculture 5, 359–387.

Borchers MR, Chang YM, Tsai IC, Wadsworth BA and Bewley JM 2016. A validation of technologies monitoring dairy cow feeding, ruminating, and lying behaviors. Journal of Dairy Science 99, 7458–7466.

Cartwright SJ, Bowgen KM, Collop C, Hyder K, Nabe-Nielsen J, Stafford R, Stillman RA, Thorpe RB and Sibly RM 2016. Communicating complex ecological models to non-scientist end users. Ecological Modelling 338, 51–59.

Ching T, Himmelstein DS, Beaulieu-Jones BK, Kalinin AA, Do BT, Way GP, Ferrero E, Agapow P-M, Zietz M, Hoffman MM, Xie W, Rosen GL, Lengerich BJ, Israeli J, Lanchantin J, Woloszynek S, Carpenter AE, Shrikumar A, Xu J, Cofer EM, Lavender CA, Turaga SC, Alexandari AM, Lu Z, Harris DJ, DeCaprio D, Qi Y, Kundaje A, Peng Y, Wiley LK, Segler MHS, Boca SM, Swamidass SJ, Huang A, Gitter A and Greene CS 2018. Opportunities and obstacles for deep learning in biology and medicine. Journal of The Royal Society Interface 15, 20170387.

Colles FM, Cain RJ, Nickson T, Smith AL, Roberts SJ, Maiden MCJ, Lunn D and Dawkins MS 2016. Monitoring chicken flock behaviour provides early warning of infection by human pathogen Campylobacter. Proceedings of the Royal Society B: Biological Sciences 283, 20152323.

Crow M, Lim N, Ballouz S, Pavlidis P and Gillis J 2019. Predictability of human differential gene expression. Proceedings of the National Academy of Sciences 116, 6491–6500.

Dempster AP, Laird NM and Rubin DB 1977. Maximum Likelihood from Incomplete Data via the EM Algorithm. Journal of the Royal Statistical Society. Series B (Methodological) 39, 1–38.

Dickinson RA, Morton JM, Beggs DS, Anderson GA, Pyman MF, Mansell PD and Blackwood CB 2013. An automated walk-over weighing system as a tool for measuring liveweight change in lactating dairy cows. Journal of Dairy Science 96, 4477–4486.

Dijkstra J, France J, Dhanoa MS, Maas JA, Hanigan MD, Rook AJ and Beever DE 1997. A model to describe growth patterns of the mammary gland during pregnancy and lactation. Journal of Dairy Science 80, 2340–2354.

Dijkstra J, Neal HDStC, Beever DE and France J 1992. Simulation of nutrient digestion, absorption and outflow in the rumen: model description. The Journal of Nutrition 122, 2239–2256.

Dumas A, Dijkstra J and France J 2008. Mathematical modelling in animal nutrition: a centenary review. The Journal of Agricultural Science 146, 123–142.

Dutta R, Smith D, Rawnsley R, Bishop-Hurley G, Hills J, Timms G and Henry D 2015. Dynamic cattle behavioural classification using supervised ensemble classifiers. Computers and Electronics in Agriculture 111, 18–28.

Ellis J, Jacobs M, Dijkstra J, van Laar H, Cant J, Tulpan D and Ferguson N 2019. The role of mechanistic models in the era of big data and intelligent computing. Advances in Animal Biosciences 7, 285–367.

Elshawi R, Maher M and Sakr S 2019. Automated machine learning: state-of-the-art and open challenges. arXiv:1906.02287 [cs, stat].

Emmans GC 1981. A model of the growth and feed intake of ad libitum fed animals, particularly poultry. British Society of Animal Production (BSAP) Occasional Publications 5, 103–110.

Ferguson NS 2014. Optimization: A paradigm change in nutrition and economic solutions. Advances in Pork Production 25, 121–127.

Ferguson NS 2015. Commercial application of integrated models to improve performance and profitability in finishing pigs. In Nutritional modelling for pigs and poultry (ed. NK Sakomura, R Gous, I Kyriazakis and L Hauschild), pp. 141–156. CABI, Wallingford, UK.

Foskolos A, Calsamiglia S, Chrenková M, Weisbjerg MR and Albanell E 2015. Prediction of rumen degradability parameters of a wide range of forages and non-forages by NIRS. Animal 9, 1163–1171.

France J 1988. Mathematical modelling in agricultural science. Weed Research 28, 419–423.

France J, Dijkstra J, Dhanoa MS, Lopez S and Bannink A 2000. Estimating the extent of degradation of ruminant feeds from a description of their gas production profiles observed in vitro: derivation of models and other mathematical considerations. British Journal of Nutrition 83, 143–150.

France J, Hanigan MD, Reynolds CK, Dijkstra J, Crompton LA, Maas JA, Bequette BJ, Metcalf JA, Lobley GE, MacRae JC and Beever DE 1999. An isotope dilution model for partitioning leucine uptake by the liver of the lactating dairy cow. Journal of Theoretical Biology 198, 121–133.

Gatys LA, Ecker AS and Bethge M 2016. Image style transfer using convolutional neural networks. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 26 June–1 July 2016, Las Vegas, NV, USA, pp. 2414–2423.

Halachmi I, Edan Y, Moallem U and Maltz E 2004. Predicting feed intake of the individual dairy cow. Journal of Dairy Science 87, 2254–2267.

Ilse M, Tomczak JM and Welling M 2018. Attention-based deep multiple instance learning. Proceedings of the 35th International Conference on Machine Learning, PMLR, 10–15 July 2018, Stockholm, Sweden, pp. 2127–2136.

Jaddoa M, Al-Jumeily A, Gonzalez L and Cuthbertson H 2019. Automatic temperature measurement for hot spots in face region of cattle using infrared thermography. Proceedings of the 16th International Conference on Informatics in Control, Automation and Robotics, 29–31 July, 2019, Prague, Czech Republic, pp. 196–201.

Johnson SC 1967. Hierarchical clustering schemes. Psychometrika 32, 241–254.

Kamilaris A and Prenafeta-Boldu FX 2018. Deep learning in agriculture: a survey. Computers and Electronics in Agriculture 147, 70–90.

Knight W 2017. There’s a big problem with AI: even its creators can’t explain how it works. MIT Technology Review. Retrieved on 27 June 2019 from https://www.technologyreview.com/s/604087/the-dark-secret-at-the-heart-of-ai/

LeCun Y, Bengio Y and Hinton G 2015. Deep learning. Nature 521, 436–444.

Liakos K, Busato P, Moshou D, Pearson S and Bochtis D 2018. Machine learning in agriculture: a review. Sensors 18, 2674.

Lloyd S 1982. Least squares quantization in PCM. IEEE Transactions on Information Theory 28, 129–137.

Matthews SG, Miller AL, Plötz T and Kyriazakis I 2017. Automated tracking to measure behavioural changes in pigs for health and welfare monitoring. Scientific Reports 7, 17582.

McCulloch WS and Pitts W 1943. A logical calculus of the ideas immanent in nervous activity. The Bulletin of Mathematical Biophysics 5, 115–133.

Misra BB, Langefeld C, Olivier M and Cox LA 2018. Integrated omics: tools, advances and future approaches. Journal of Molecular Endocrinology 62, R21–R45.

Morota G, Ventura RV, Silva FF, Koyama M and Fernando SC 2018. Big Data analytics and precision animal agriculture symposium: machine learning and data mining advance predictive big data analysis in precision animal agriculture. Journal of Animal Science 96, 1540–1550.

Neethirajan S, Tuteja SK, Huang S-T and Kelton D 2017. Recent advancement in biosensors technology for animal and livestock health management. Biosensors and Bioelectronics 98, 398–407.

Olah C 2017. Feature Visualization. Retrieved on 8 July 2019 from https://distill.pub/2017/feature-visualization/

Parsons DJ, Green DM, Schofield CP and Whittemore CT 2007. Real-time control of pig growth through an integrated management system. Biosystems Engineering 96, 257–266.

Pearl J and Mackenzie D 2018. The book of why: the new science of cause and effect. Basic Books, New York, NY, USA.

Pegorini V, Zen Karam L, Pitta C, Cardoso R, da Silva J, Kalinowski H, Ribeiro R, Bertotti F and Assmann T 2015. In vivo pattern classification of ingestive behavior in ruminants using FBG sensors and machine learning. Sensors 15, 28456–28471.

Pomar C, Pomar J, Rivest J, Cloutier L, Létourneau-Montminy M, Andretta I, Hauschild L, Sakomura NK, Gous RM and Kyriazakis I 2015. Estimating real-time individual amino acid requirements in growing-finishing pigs: towards a new definition of nutrient requirements in growing-finishing pigs? In International Symposium: Modelling in Pig and Poultry Production, 18–20 June 2013, Sao Paulo, Brazil, pp. 1–19.

Pomar C and Remus A 2019. Precision pig feeding: a breakthrough toward sustainability. Animal Frontiers 9, 52–59.

Rosenblatt F 1958. The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review 65, 386–408.

Sadeghi M, Banakar A, Khazaee M and Soleimani M 2015. An intelligent procedure for the detection and classification of chickens infected by Clostridium perfringens based on their vocalization. Revista Brasileira de Ciência Avícola 17, 537–544.

Sakomura NK, Gous R, Kyriazakis I and Hauschild L 2015. Nutritional modelling for pigs and poultry. CABI, Wallingford, UK.

Samek W, Wiegand T and Müller K-R 2017. Explainable artificial intelligence: understanding, visualizing and interpreting deep learning models. arXiv:1708.08296 [cs, stat].

Sauvant D and Nozière P 2016. Quantification of the main digestive processes in ruminants: the equations involved in the renewed energy and protein feed evaluation systems. Animal 10, 755–770.

Stefanovic N and Milosevic D 2017. Model for big data analytics in supply chain management. Proceedings of the 7th International Conference on Information Society and Technology ICIST, 12–15 March, 2017, Kopaonik, Serbia, pp. 163–168.

Tedeschi LO 2006. Assessment of the adequacy of mathematical models. Agricultural Systems 89, 225–247.

Tedeschi LO 2019. ASN-ASAS symposium: Future of Data Analytics in Nutrition: mathematical modeling in ruminant nutrition: approaches and paradigms, extant models, and thoughts for upcoming predictive analytics. Journal of Animal Science 97, 1921–1944.

Tjørve KMC and Tjørve E 2017. The use of Gompertz models in growth analyses, and new Gompertz-model approach: an addition to the Unified-Richards family. PLoS ONE 12, 1–17.

Vasilieva EV 2018. Developing the creative abilities and competencies of future digital professionals. Automatic Documentation and Mathematical Linguistics 52, 248–256.

White RP, Schofield CP, Green DM, Parsons DJ and Whittemore CT 2004. The effectiveness of a visual image analysis (VIA) system for monitoring the performance of growing/finishing pigs. Animal Science 78, 409–418.

Xu M and Rhee SY 2014. Becoming data-savvy in a big-data world. Trends in Plant Science 19, 619–622.
