Artificial Intelligence in Medicine (2008) 42, 213-227

http://www.intl.elsevierhealth.com/journals/aiim

Nutri-Educ, a nutrition software application for balancing meals, using fuzzy arithmetic and heuristic search algorithms

Jean-Christophe Buisson a,b,*

a University of Toulouse, INPT, IRIT (ENSEEIHT), 2 rue Camichel, 31071 Toulouse, France
b Rangueil University Hospital, Toulouse, France

Received 22 June 2007; received in revised form 10 December 2007; accepted 11 December 2007

KEYWORDS: Fuzzy arithmetic; Heuristic search; Nutrition; Nutri-Educ

Summary

Objective, methods and results: Nutri-Educ is a nutrition software application developed in cooperation with the Diabetology Department of Toulouse’s Rangueil University Hospital. It aims at helping any person to balance their meals. More specifically, its main goal is to enable a user to describe a meal and assess its content, and in most cases to find a small set of acceptable actions which make it well balanced and in accordance with the user’s energy needs.

Fuzzy numbers are used to represent the inherent imprecision and fuzziness of food quantities and nutrient values as well as to model the gradual boundaries of the daily recommended values associated with each nutrient. Fuzzy arithmetic is used to perform computations on such quantities and fuzzy pattern matching provides measures of the compatibility of data to nutrient norms. Innovative visual gauges have been designed to display this information in a simple, yet comprehensive way. Finally, heuristic search algorithms are used to find a set of actions, acceptable from a nutritional point of view, which will transform the initial meal into a well-balanced one.

Conclusion: Fuzzy arithmetic proves to be an adequate model for naturally representing food quantities and values as well as for performing all necessary computation and compatibility assessments. By combining it with new interface techniques and heuristic search algorithms, it allows the Nutri-Educ software application to balance and improve meals, a complex qualitative and quantitative problem which is solved in 87% of our benchmark database.

© 2007 Elsevier B.V. All rights reserved.

* Correspondence address: University of Toulouse, INPT, IRIT (ENSEEIHT), 2 rue Camichel, 31071 Toulouse, France. Tel.: +33 561 588 200; fax: +33 561 588 306.

E-mail address: [email protected].

0933-3657/$ - see front matter © 2007 Elsevier B.V. All rights reserved. doi:10.1016/j.artmed.2007.12.001


1. Introduction

Improving people's dietary habits over the long term is a socially important task, which would help to decrease the frequency of cardiovascular disease and the morbidity of many chronic diseases such as diabetes. Our group has been working for a number of years on computer-based educational tools for diabetics and people who are overweight [1]. Nutri-Educ is one of these tools, and it has several unique features. First, it is the only one to deal with the inherent imprecision of nutritional data, and to use it in computations and assessments in a mathematically sound way. To our knowledge, no other software application combines nutrition and fuzzy sets; some works [2] only describe how the concept of a fuzzy set may help to define the ideal intake for foods. Secondly, Nutri-Educ is able to ‘fix meals’, that is to say, to compute sets of actions which transform a meal to make it well balanced. Many other commercial and research tools help people build diet plans and analyze the composition of meals, but we did not find any others which deal with this problem of balancing.

We will first describe fuzzy arithmetic as a tool for representing and propagating imprecise data. Then we will focus on the interface methods used to present results. Finally, we will describe the balancing algorithm, and how Nutri-Educ integrates all these aspects.

2. Fuzzy arithmetic and nutrition

2.1. Problem statement

In this part we are interested in using fuzzy set theory [3] to represent and manipulate imprecise quantities. The values used in nutritional computations are indeed affected by imprecision, and sometimes by uncertainty; this affects the weights of the manipulated foods as well as their nutrient values. It also concerns the internationally recommended values, which are intervals with gradual boundaries.

We first need a mathematically sound model, adapted to the representation of such values. Several potential theories exist, such as interval arithmetic [4] and classical fuzzy arithmetic [5]. Their respective advantages and disadvantages for the problems our applications need to deal with have been examined, and classical fuzzy arithmetic was found to be the better adapted of the two. Probability theory was not selected as a possible solution, since it is better suited to the representation of random uncertainty than of imprecision, which is our central problem.

Once a set of representations of imprecise and/or uncertain values is stored in its memory, a computer needs to be able to operate on these values. It should be able to do all the ordinary arithmetic on this type of value: adding, subtracting, multiplying by constants (precise or imprecise), computing mean values, etc. [4-6]. The computer will also need to compare imprecise values and to evaluate to what extent a value belongs to an interval with gradual boundaries. This degree of belonging will be neither a Boolean, nor a simple value, nor an interval [7], but a set of possibility and necessity values.

These fuzzy computations will then be used in a generalized heuristic search algorithm, which will be able to find the smallest set of acceptable actions to transform a given meal into a well-balanced meal.

Finally, we will need interface methods for end users and nutritionists in order for them to be able to interpret the meaning and all the nuances contained in such representations, or to perform a data collection task that will lead to the construction of such representations. We believe that adapted interface techniques can be an important factor for the acceptance of fuzzy models by a wider audience. Our own research work in this field [8] will be detailed here; it has been validated by several software applications, Nutri-Educ being one of them [9,10].

2.2. Sources of imprecision and uncertainty in nutrition

2.2.1. Food composition

Many food composition tables exist, both as paper documents and as computer databases. They are usually associated with a particular country, since even common foods such as yoghurts come in different sizes and have different ingredients, depending on the country. Since 1892 in the United States, the Department of Agriculture (USDA) has been keeping, in a very organized and systematic manner, a freely accessible database which contains the composition of all raw and processed foods available in the country [11,12]; more than 440,000 nutrient values are recorded for more than 6000 foods. For reasons of future compatibility, and considering the quality of their database organization, we have adopted a similar one in all our software applications, which presently contains only French food values.

There is often a problem of imprecision with the nutrient values but no problem of uncertainty: it is generally known which nutrients are present in a food, but the problem is to know in what amounts they are present. If there is doubt as to the presence of a particular nutrient, its supposed amount will be very small, which takes us back to the problem of the imprecision of a near-zero value. This problem may become more acute for industrially processed foods, for which manufacturers give only partial information. In the American USDA food values table, each quantity (a precise number) is given with two values: a measurement, and a standard deviation. The standard deviation values are unfortunately often zero, which means that the values have been obtained directly from the manufacturer, without verification from an independent laboratory.

Figure 1 Extract from a food survey of a 10-year-old child. She uses naturally expressed portions (‘grappe’ = ‘bunch’, ‘bout’ = ‘piece’), sometimes with a sizing adjective (‘petit’ = ‘small’).

When the food has been processed and packaged in a precise and stable way (for instance oil, sugar, biscuits, etc.), the nutrient values are often known precisely and with certainty.

For all other foods, the nutrient value for 100 g of a food referred to by its usual name (for instance: ‘apple’, ‘white bread’, ‘potatoes’) may vary over a range that may be fairly large, depending on the circumstances. The carbohydrate quantity in 100 g of apple, for example, depends on the variety of apple, as well as on its degree of ripeness. Even after controlling for ripening and variety, there are variations due to soil and growing conditions.

2.2.2. Weights of foods eaten

The weights of foods absorbed by a person during a meal are often imprecisely known, as can be seen for both foods mentioned in the following extract (Fig. 1) from a food survey of a 10-year-old child.

In both cases, the food quantity is evaluated using its natural linguistic qualification: “bunch” for grapes, “piece” for cheese. A sizing adjective may then be added to this qualification: “a small bunch of grapes”, “a big piece of cheese”, etc.

More exhaustively, we have tried to verify that this mode of linguistic qualification of a food quantity is the most widely used, by studying the results of food surveys done on 1607 children of ages ranging from 10 to 14, in the setting of the ‘Nutri-Advice’ study which was conducted for 3 years at six junior high schools in the Toulouse area.

Figure 2 Categorization of nominal groups qualifying food quantities.

We have categorized in these surveys all nominal groups which expressed a food quantity. Results are given in Fig. 2. It can be seen that the last three qualification classes are by far the largest ones.

2.2.3. Norms and international recommendations

The international medical research community periodically publishes norms and guidelines describing the characteristics of a balanced diet for healthy children and adults as well as for people suffering from chronic illnesses which require particular diet restrictions [13-15].

These norms indicate:

• The daily caloric amount related to age, height, gender, level of physical activity, and (to a lesser extent) weight. This value is the amount of energy a person must absorb every day, not taking into account the kinds of foods to be eaten.

• The amount of particular nutrients (carbohydrates, fat, proteins, saturated fat, etc.) in this daily caloric amount. These values are often expressed as a percentage of this total (for instance, energy coming from carbohydrates should represent 50% of total energy); they are sometimes expressed as an absolute value for the day (for instance, a child under 12 needs 100 mg of calcium every day, independent of her daily energy needs).


These norms are intervals with imprecise boundaries.

2.3. Fuzzy arithmetic for the representation of imprecision and fuzziness in nutrition

In this part we will present fuzzy arithmetic and show that it is well suited for representing imperfectly known nutritional data. We will examine in particular its ability to:

• represent as directly as possible, and with all necessary nuances, the different forms of imprecision or fuzziness found in food surveys, nutritional definitions and international norms and guidelines;

• finely evaluate the compatibility of a piece of data with a category, a fundamental operation when assessing the balance of a real or hypothetical meal.

2.3.1. Fuzzy quantities

A fuzzy quantity (or fuzzy interval or fuzzy number) is a fuzzy set [3] of real numbers, written $M$, with a membership function $\mu_M$ which is unimodal and upper semicontinuous, that is to say such that $\forall \alpha \in\, ]0, 1]$, $M_\alpha = \{ r \mid \mu_M(r) \geq \alpha \}$ (the $\alpha$-cut of $M$) is a closed interval (Fig. 3).

Figure 3 Fuzzy interval, shown with an $\alpha$-cut $M_\alpha$ of height $\alpha$.

A fuzzy interval generalizes the concept of a closed interval, including ordinary real numbers. It can model the membership area of a variable $x$ with more sophistication than an ordinary interval. More precisely, the support $S(M) = \{ r \mid \mu_M(r) > 0 \}$ is the largest membership area of $x$ ($x$ cannot take a value outside $S(M)$), whereas the kernel $\{ r \mid \mu_M(r) = 1 \}$ is the set of the most plausible values for $x$, also called modal values.

A fuzzy number is adapted to the representation of imprecise quantities. In most situations where a parameter must be evaluated (the value of which is not known with precision), an ordinary interval is not satisfactory. If we choose a fairly large interval in order to be sure that the value is in it, the subsequent calculations based on this representation may lead to results that are not specific enough to be of real value. On the contrary, if we choose too narrow an interval, by restricting it to the most plausible values, the high precision of the results might be incorrect if an error has been made at the beginning and the actual value may not be included in the interval. A fuzzy interval allows both an optimistic and a pessimistic view: the support of the number is chosen large enough to be sure that no value is needlessly excluded, with the kernel representing the most plausible values.

The most important point is to determine precisely both the set of values which are completely impossible (the values for which $\mu(x)$ equals 0) and the set of values which are completely possible (for which $\mu(x)$ equals 1): the remaining subsets of the domain will correspond to gradual transitions.

Three different shapes for the membership function (MF) are commonly used: triangular, trapezoidal and Gaussian-type. The triangular MF is excluded because its kernel is reduced to a singleton, whereas the kernel is often an interval in our application. Gaussian-type MFs would perhaps represent fuzzy values more exactly, but their support and kernel are difficult for dieticians, for example, to define precisely, and they lead to more intensive computations, especially when running the balancing algorithm.

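A trapezoidal form for the membership function is enough to capture the most important properties. From a computational point of view, we will model such functions by 4-tuples $(\underline{m}, \overline{m}, \alpha, \beta)$, where $[\underline{m}, \overline{m}]$ is the kernel and $\alpha$ and $\beta$ are the left and right spreads (Fig. 4).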

Fig. 5 represents several typical situations related to nutrition, with trapezoidal fuzzy numbers.

Figure 4 Fuzzy interval modeled by a 4-tuple $(\underline{m}, \overline{m}, \alpha, \beta)$.
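As an illustration of this 4-tuple model, the following minimal Python sketch stores a trapezoidal fuzzy interval and evaluates its membership function and its cuts. The class name, the field names and the apple figures are hypothetical and chosen for the example only; they are not taken from Nutri-Educ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trapezoid:
    """Trapezoidal fuzzy interval (m_lo, m_hi, alpha, beta).

    [m_lo, m_hi] is the kernel; alpha and beta are the left and right
    spreads, so the support is [m_lo - alpha, m_hi + beta].
    Field names are hypothetical.
    """
    m_lo: float
    m_hi: float
    alpha: float = 0.0
    beta: float = 0.0

    def membership(self, r: float) -> float:
        """Membership degree of r: 1 in the kernel, 0 outside the support."""
        if self.m_lo <= r <= self.m_hi:
            return 1.0
        if r < self.m_lo:
            # Left slope; a zero spread means a sharp edge.
            if self.alpha == 0:
                return 0.0
            return max(0.0, 1.0 - (self.m_lo - r) / self.alpha)
        if self.beta == 0:
            return 0.0
        return max(0.0, 1.0 - (r - self.m_hi) / self.beta)

    def cut(self, level: float) -> tuple[float, float]:
        """Closed interval of values whose membership is at least `level`."""
        if not 0.0 < level <= 1.0:
            raise ValueError("level must be in ]0, 1]")
        return (self.m_lo - (1.0 - level) * self.alpha,
                self.m_hi + (1.0 - level) * self.beta)


# Illustrative figures only: carbohydrates in 100 g of apple, most plausibly
# between 11 and 13 g, and certainly between 9 and 16 g.
apple_carbs = Trapezoid(11.0, 13.0, 2.0, 3.0)
```

With these assumed figures, the cut at level 1 is the kernel [11, 13], and the cut at level 0.5 lies halfway between the kernel and the support.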


Figure 5 Typical situations of fuzzy nutritional values.

2.3.2. Implementation of fuzzy computation

Our problem here is to compute the fuzzy quantity which limits the domain of a variable $f(v)$, given the fuzzy quantity which limits the variable $v$. Here $f$ can be a common mathematical operation, unary or binary.

When $f$ is continuous, it can be shown that

$$f(M, N)_\alpha = f(M_\alpha, N_\alpha) \qquad (2)$$

In a word, the $\alpha$-cut of the image is the image of the $\alpha$-cuts, and a fuzzy computation appears as a family of interval computations made for all possible values of $\alpha$ from 0 to 1, from a prudent computation with the supports ($\alpha = 0^+$) to a daring computation with the kernels ($\alpha = 1$).

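Eq. (2) allows us to compute $f(M_1, M_2)$ for common arithmetic operators, $M_1$ and $M_2$ being two trapezoidal fuzzy numbers $(\underline{m}_1, \overline{m}_1, \alpha_1, \beta_1)$ and $(\underline{m}_2, \overline{m}_2, \alpha_2, \beta_2)$. For addition, for example, it is easy to see that

$$(\underline{m}_1, \overline{m}_1, \alpha_1, \beta_1) + (\underline{m}_2, \overline{m}_2, \alpha_2, \beta_2) = (\underline{m}_1 + \underline{m}_2,\ \overline{m}_1 + \overline{m}_2,\ \alpha_1 + \alpha_2,\ \beta_1 + \beta_2) \qquad (3)$$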

For multiplication and division, an approximation is necessary, since the result is no longer exactly trapezoidal. The two most important elements of the result are the support and the kernel, which can be computed precisely. The approximation consists in making the result trapezoidal, by drawing straight lines between the support and the kernel instead of following the piecewise-defined shape of the exact result. This approximation leaves the order of the membership degrees on [0, 1] unchanged.
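The addition rule of Eq. (3) and the trapezoidal approximation of a product can be sketched as follows. This is only a sketch under stated assumptions: the names `Trap`, `add`, `mul` and `cut` are hypothetical, division is omitted, and the product is restricted to operands with non-negative supports, which is the case of interest since food quantities are positive.

```python
from typing import NamedTuple


class Trap(NamedTuple):
    """Trapezoidal fuzzy number; field names are hypothetical."""
    m_lo: float   # lower bound of the kernel
    m_hi: float   # upper bound of the kernel
    alpha: float  # left spread
    beta: float   # right spread


def add(x: Trap, y: Trap) -> Trap:
    """Exact sum, component by component, as in Eq. (3)."""
    return Trap(x.m_lo + y.m_lo, x.m_hi + y.m_hi,
                x.alpha + y.alpha, x.beta + y.beta)


def mul(x: Trap, y: Trap) -> Trap:
    """Approximate product of two trapezoids with non-negative supports.

    The kernel and the support are exact interval products; the sides are
    redrawn as straight lines between them.
    """
    if x.m_lo - x.alpha < 0 or y.m_lo - y.alpha < 0:
        raise ValueError("this sketch only handles non-negative supports")
    k_lo, k_hi = x.m_lo * y.m_lo, x.m_hi * y.m_hi
    s_lo = (x.m_lo - x.alpha) * (y.m_lo - y.alpha)
    s_hi = (x.m_hi + x.beta) * (y.m_hi + y.beta)
    return Trap(k_lo, k_hi, k_lo - s_lo, s_hi - k_hi)


def cut(x: Trap, level: float) -> tuple[float, float]:
    """Interval of values whose membership is at least `level` (0 < level <= 1)."""
    return (x.m_lo - (1.0 - level) * x.alpha,
            x.m_hi + (1.0 - level) * x.beta)


x = Trap(2.0, 3.0, 1.0, 1.0)
y = Trap(4.0, 5.0, 1.0, 2.0)
total = add(x, y)      # kernel [6, 8], support [4, 11]
product = mul(x, y)    # kernel [8, 15], support [3, 28]
```

For these two operands, the exact product of the cuts at level 0.5 is [1.5 × 3.5, 3.5 × 6] = [5.25, 21], whereas the trapezoidal approximation gives [5.5, 21.5] at the same level; the support and the kernel coincide with the exact result.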

Moreover, Schneeman has shown [16] that an approximation operator can be used, which preserves the expected interval and ordering. We intend to use this operator in the near future, provided that it does not increase computing time too much in the balancing algorithm.

2.3.3. Fuzzy pattern matching with a single variable

After having computed a fuzzy quantity, for example the total amount of energy in a meal, we need to compare it to a norm, in order to assess whether it is compatible. Let $P$ and $D$ be the pattern (the norm) and the data, represented by the possibility distributions $\mu_P$ and $\mu_D$, respectively. It has been shown [17] that the following two scalar measures express several important aspects of the compatibility of data $D$ with pattern $P$:

$$\Pi(P; D) = \sup_u \min(\mu_P(u), \mu_D(u)) \qquad (4)$$

$$N(P; D) = 1 - \Pi(\overline{P}; D) = \inf_u \max(\mu_P(u), 1 - \mu_D(u)) \qquad (5)$$

where $\overline{P}$ is the complement of $P$, with $\mu_{\overline{P}} = 1 - \mu_P$.

$\Pi(P; D)$ evaluates to which degree it is possible that $P$ and $D$ refer to the same value; it measures the overlap of $D$ and $P$. $N(P; D)$ measures to which degree it is certain that $D$ belongs to $P$; it is the degree of inclusion of $D$ in $P$. $\Pi(P; D)$ is an optimistic measure of compatibility, whereas $N(P; D)$ is pessimistic and requires verification.

Fig. 6 shows how these two values can be obtained geometrically from trapezoidal fuzzy values.
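For trapezoidal values, both measures reduce to finding where two straight sides cross. The sketch below is one possible closed-form implementation of Eqs. (4) and (5), not necessarily the one used in Nutri-Educ; the names `Trap`, `possibility` and `necessity` are hypothetical, and the norm and data values are illustrative only.

```python
from typing import NamedTuple


class Trap(NamedTuple):
    """Trapezoidal fuzzy number; field names are hypothetical."""
    m_lo: float   # lower bound of the kernel
    m_hi: float   # upper bound of the kernel
    alpha: float  # left spread
    beta: float   # right spread


def _ratio(gap: float, width: float) -> float:
    """gap / width clipped to [0, 1]; sharp edges (width == 0) give 0 or 1."""
    if width > 0:
        return min(1.0, max(0.0, gap / width))
    return 1.0 if gap > 0 else 0.0


def possibility(p: Trap, d: Trap) -> float:
    """Height of the intersection of pattern p and data d, Eq. (4)."""
    if d.m_lo > p.m_hi:   # data kernel lies to the right of the pattern kernel
        return 1.0 - _ratio(d.m_lo - p.m_hi, p.beta + d.alpha)
    if d.m_hi < p.m_lo:   # data kernel lies to the left of the pattern kernel
        return 1.0 - _ratio(p.m_lo - d.m_hi, p.alpha + d.beta)
    return 1.0            # the kernels overlap


def necessity(p: Trap, d: Trap) -> float:
    """Degree of inclusion of d in p, Eq. (5).

    Computed as 1 minus the possibility that d meets the complement of p,
    evaluated separately on the left and right sides of the pattern.
    """
    left = _ratio(p.m_lo - (d.m_lo - d.alpha), p.alpha + d.alpha)
    right = _ratio((d.m_hi + d.beta) - p.m_hi, p.beta + d.beta)
    return 1.0 - max(left, right)


norm = Trap(50.0, 60.0, 10.0, 10.0)      # illustrative norm
inside = Trap(55.0, 58.0, 2.0, 2.0)      # fuzzy data within the norm kernel
precise = Trap(65.0, 65.0, 0.0, 0.0)     # precise data on the norm's right slope
outside = Trap(80.0, 90.0, 5.0, 5.0)     # fuzzy data beyond the norm support
```

In this example the three data values yield (possibility, necessity) pairs of (1, 1), (0.5, 0.5) and (0, 0) respectively; for precise data both measures coincide with the membership degree of the value in the norm.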

2.3.4. Fuzzy pattern matching with several variables

We are now able to compute a set of compatibility values $\Pi(P_i; D_i)$ and $N(P_i; D_i)$ by separately matching all data $D_1, D_2, \ldots, D_n$ with the corresponding patterns $P_1, P_2, \ldots, P_n$. Each pair $D_i$, $P_i$ represents what is known about the value of a nutrient, for example, and the corresponding norm for this nutrient in the meal.

In our applications, however, the nutritionists assign different degrees of importance to the nutrients considered. For example, energy, saturated fat and carbohydrates are three nutrients which have a far greater importance than others for normal adults. The degree of relative importance may vary with the group considered: calcium is more important for children than adults, cholesterol level is very important for people with cholesterol problems, etc.

Figure 6 Geometrical determination of the possibility and necessity values of the compatibility between data D and pattern P.

Nutritionists spontaneously weigh the importance of each individual norm; for example, for an adult with no medical problems, they assign 4 to energy, saturated fat and carbohydrates, 2 to protein and 1 to all other nutrients.

Let us call $v_1, v_2, \ldots, v_n$ the respective weightings of patterns $P_1, P_2, \ldots, P_n$, with $\forall i,\ v_i \leq 1$, and suppose that at least one of the degrees $v_i$ is 1. For example, for an adult with no medical problems, these degrees are 1 for energy, saturated fat and carbohydrates, 0.5 for protein and 0.25 for all others.

If $s_i$ denotes the degree of compatibility (possibility or necessity) between $D_i$ and $P_i$, then the global compatibility degree can be computed by

$$s = \min_{i=1,\ldots,n} \max(1 - v_i,\ s_i) \qquad (6)$$
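The weighted aggregation of Eq. (6) can be sketched in a few lines of Python. The function names are hypothetical, the raw weights are the ones quoted above for an adult with no medical problems, ‘fiber’ merely stands in for the ‘other nutrients’ class, and the compatibility degrees are illustrative values rather than results from the benchmark.

```python
def normalize_weights(raw: dict[str, float]) -> dict[str, float]:
    """Rescale raw importance levels so that the largest weight is 1."""
    top = max(raw.values())
    return {name: level / top for name, level in raw.items()}


def global_compatibility(weights: dict[str, float],
                         degrees: dict[str, float]) -> float:
    """Weighted min-max aggregation of Eq. (6).

    A nutrient with weight v can never pull the global degree below 1 - v,
    so unimportant nutrients have a limited influence on the result.
    """
    return min(max(1.0 - weights[name], degrees[name]) for name in weights)


# Raw levels 4 / 2 / 1 become the weights 1 / 0.5 / 0.25.
weights = normalize_weights({
    "energy": 4, "saturated fat": 4, "carbohydrates": 4,
    "protein": 2, "fiber": 1,
})

# Illustrative compatibility degrees (possibility or necessity) per nutrient.
degrees = {
    "energy": 1.0, "saturated fat": 0.8, "carbohydrates": 1.0,
    "protein": 0.2, "fiber": 0.0,
}

score = global_compatibility(weights, degrees)
```

Here the poorly matched protein norm limits the global degree to 0.5, while the completely unmatched but low-weight nutrient cannot lower it below 0.75.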

3. Man/machine interface for a fuzzy system

Little research has been done in the field of man/machine interfaces for fuzzy systems; however, such interfaces are the most visible aspects of models using fuzzy set theory. Intuitive interface techniques, ergonomically adapted to an application, would be helpful assets for the acceptance of fuzzy models.

3.1. Gauges for visualizing compatibility between data and patterns

For a long time, messages such as “the amount of carbohydrates in your meal is higher than recommended” have been given to users. Yet this kind of presentation has always looked clumsy to us, and we had the feeling that this impression was shared by users, whether they were nutritionists, dieticians or patients. Moreover, such sentences take quite a long time to read and understand, and each time some aspect of the situation changes, a new sentence has to be produced and read.

Then we had the idea to use an analogy with measuring devices with needles (to measure intensity, pressure, etc.), which would display both analog values and areas of admissible values. We will call such devices ‘gauges’; they display both a fuzzy quantity D and an associated norm P (Fig. 7). At a glance, we can assess the compatibility of a value with a norm. When the value or the norm changes, the nature of the change is perceived immediately on the gauge. Because such visual changes readily attract the user's attention, gauges could also be used in monitoring systems.

Figure 7 The different areas of a gauge. Here, the data (needle) is slightly fuzzy, but is nevertheless completely included in the norm kernel.

The data is represented by a needle of variable width. If it is precise, the needle is thin, with a width of only one pixel. If the data is imprecise but not fuzzy, the needle is a circular arc filled in black, the two extreme borders corresponding to the interval boundaries. If the data is fuzzy (as in Fig. 7), the kernel is represented by a black filled circular arc, and the remaining part of the support is represented by thin needles every 2°, to show that the value could be one of these values.

As for the norm pattern, it is represented by colored areas under the needle. It is red outside the support (forbidden), the kernel is green (authorised) and all other areas (support minus kernel) are orange. The bottom of the gauge displays a short label naming the data represented, written on a background whose color expresses the level of compatibility between the data and the norm pattern. This background is green when the whole needle is included in the pattern’s kernel (possibility = 1, necessity = 1), it is red when the whole needle is outside the support (possibility = 0, necessity = 0), and it is orange in all other, less clear-cut situations (possibility > 0).

Figure 8 Different compatibility situations between several data and norm patterns.

We can see in Fig. 8 various typical situations. In the energy example, the data is fuzzy, but corresponds perfectly to the norm pattern, which explains the green color at the bottom of the gauge. The case of precise data can be seen in the carbohydrate example, with a medium compatibility with the norm pattern (orange bottom color). In the calcium example (green band), neither the norm nor the data has an upper limit. Nevertheless, the compatibility is perfect (green bottom color). In the saturated fat example, the situation is the reverse of the calcium example: the norm does not have a lower limit, and the data is totally incompatible with it (red bottom color) despite its great imprecision (very wide needle). The last example shows completely unknown data, displayed as a ‘?’ and a needle covering the whole domain.

The calcium example is again considered in Fig. 8. It is a situation which arises often in our software applications when cumulating quantities such as calcium values, where one of the values is unknown. It leads to data where we have a lower boundary value, but no upper boundary value. We can see in this example that it is still possible to get a precise evaluation of the compatibility with a norm pattern.

3.2. Textual representations for fuzzy quantities

Our nutritionists have the responsibility of providing nutrient values for each food, using custom database applications. They do so using a written language for fuzzy numbers which has been designed in coordination with them, and which has the following properties:

• it is an extension of the usual form for decimal real numbers and for some simple algebraic forms;

• it is simple to read and write; and

• it allows them to express precise values, intervals, fuzzy numbers, upper and lower bounds, possibly fuzzy.


Certainly, such a written language already exists in the form of 4-tuples for trapezoidal fuzzy numbers. The nutritionists in our group are to some extent able to deal with them, but they do not find it easy. 4-tuples take time to interpret, which is a serious flaw in applications where people must sometimes look at large arrays of data. An alternative written representation for nutritionists is therefore necessary.

As in any such written representation system, we must clearly describe the domain of the represented objects, the domain of their representations, and how the reading and printing procedures, which convert from one domain to the other (Fig. 9), work.

Only positive fuzzy numbers are represented. We shall specify the different representation classes by giving examples and by describing the represented numbers as 4-tuples each time. These classes are:

3.2.1. Precise numbers

Their precision is limited to three significant figures. For example:

123 → (123., 123., 0., 0.)

3.2.2. Ordinary intervals

Their written form has been simplified, since standard mathematical notation was declared ‘too difficult to write’ by nutritionists. The following examples are better than a formal description:

12.2-25.1 → (12.2, 25.1, 0., 0.)
1000-1200 → (1000., 1200., 0., 0.)


3.2.2.1. Fuzzy numbers. They have two forms: ~val, which represents val ± 10%, and ~~val, which represents val ± 20%. In both cases, the represented number is triangular; the support is the interval [val − val·n/100, val + val·n/100], where n is 10 or 20:

~200 → (200., 200., 20., 20.)
~~3.141592 → (3.14, 3.14, 0.628, 0.628)

Figure 9 Reader and printer for fuzzy numbers.

The forms ~0 and ~~0 are forbidden, since they would represent (0., 0., 0., 0.), that is to say 0., whereas a nutritionist using them would usually mean the existence of a small quantity. In any case it would be ambiguous: what difference could be made between ~0 and ~~0?

In these situations other forms, which we willdescribe later, can be used.

3.2.3. Intervals with fuzzy boundaries

They are represented by a combination of the previous forms for intervals and fuzzy numbers. For example:

~200-~400 → (200.0, 400.0, 20.0, 40.0)
0-~23.5 → (0., 23.5, 0., 2.35)

Of course, the forms ~val and ~val-~val represent the same fuzzy number. The same applies to ~~val and ~~val-~~val.

3.2.4. Unbounded values

We use the characters ‘<’ and ‘>’, along with the forms used for fuzzy numbers. For example:

>400 → (400., 0., ∞, ∞)
<~10 → (0., 0., 10.0, 1.0)

To represent a value close to 0, we can use a form such as <~0.01.

In addition, we use ‘?’ to represent the completely unknown value (0., 0., ∞, ∞).

3.2.5. Reader and printer

The reader’s procedure is simple: from a written form such as ~100, a simple lexical and syntactical analysis is performed in order to reconstruct the associated 4-tuple (100.0, 100.0, 10.0, 10.0).
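A minimal reader for the bounded written forms might look like the Python sketch below. It rests on several assumptions that are not confirmed by the text: the approximation mark is typed as a tilde (`~`), the interval separator is a plain hyphen, values and spreads are rounded to three significant figures (inferred from the 3.141592 example), and the names `read_fuzzy`, `_border` and `_sig3` are hypothetical. The unbounded forms using ‘<’ and ‘>’ and the unknown value ‘?’ are deliberately left out and are rejected by this sketch.

```python
import re

# One or two borders, each an unsigned decimal optionally prefixed by ~ or ~~.
_FORM = re.compile(r"(~{0,2})(\d+(?:\.\d*)?)(?:-(~{0,2})(\d+(?:\.\d*)?))?")


def _sig3(x: float) -> float:
    """Round to three significant figures (assumed from the examples)."""
    return float(f"{x:.3g}")


def _border(tildes: str, digits: str) -> tuple[float, float]:
    """Return (kernel border, spread) for one border such as '~~3.141592'."""
    value = _sig3(float(digits))
    percent = 10 * len(tildes)   # '' -> 0 %, '~' -> 10 %, '~~' -> 20 %
    if percent and value == 0:
        raise ValueError("'~0' and '~~0' are forbidden forms")
    return value, _sig3(value * percent / 100)


def read_fuzzy(text: str) -> tuple[float, float, float, float]:
    """Convert a bounded written form into (m_lo, m_hi, alpha, beta)."""
    match = _FORM.fullmatch(text.replace(" ", ""))
    if match is None:
        raise ValueError(f"unsupported written form: {text!r}")
    left_tildes, left_digits, right_tildes, right_digits = match.groups()
    lo, left_spread = _border(left_tildes, left_digits)
    if right_digits is None:     # single value: both borders coincide
        return (lo, lo, left_spread, left_spread)
    hi, right_spread = _border(right_tildes, right_digits)
    if hi < lo:
        raise ValueError("interval bounds are out of order")
    return (lo, hi, left_spread, right_spread)
```

Under these assumptions, `read_fuzzy("~200-~400")` gives (200.0, 400.0, 20.0, 40.0) and `read_fuzzy("0-~23.5")` gives (0.0, 23.5, 0.0, 2.35), in line with the examples of the previous sections.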

The printer’s procedure is more complicated: for a given 4-tuple, it has to find the written form which represents it exactly when it exists; if it does not exist, it must find the “simplest” form which represents it “best”. For example, (200.0, 400.0, 20.0, 40.0) is represented exactly by ~200-~400, but what about the 4-tuple (200.0, 400.0, 20.0, 10.0)? The representation ~200-~400 does not do justice to the higher precision of the right border, and the form ~200-400 makes it appear more precise than it is in reality.

Concretely, we must try to represent the left andright borders of the value separately.

When the left border is 0 or when the right border is infinity, we will write the number with ‘<’ or ‘>’; when the forms for the left and right borders are identical, we will write ‘~val’ or ‘~~val’; otherwise we will use the general interval form.

In order to find the written form for the left (resp. right) border (neither zero nor infinite), we measure the proportion of the left (resp. right) spread relative to the left (resp. right) border of the kernel. When this proportion is over 20%, we use the form ‘~~val’, where val is the left (resp. right) border of the kernel; when it is over 10% we use the form ‘~val’; otherwise (less than 5%) we use the ordinary decimal form for real numbers.

For example, from the 4-tuple (3.144, 3.141, 0.314, 0.35) we can see that the left and right borders of the support are neither zero nor infinite. By studying the two numbers characterizing the left border, we see that the spread 0.314 is approximately 10% of the left border of the kernel, 3.144; it is therefore represented by ~3.14 (rounded to three figures). For the right border, the spread 0.35 is between 10% and 20% of the value 3.141; it is therefore also represented by ~3.14 (rounded to three figures). Since the forms for the left and right borders are identical, the final form for (3.144, 3.141, 0.314, 0.35) is ~3.14.
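The border rule can be sketched as follows, for finite non-zero borders only. This is an illustrative sketch rather than the actual printer: the function names are hypothetical, the markers are assumed to be typed as `~` and `~~`, and the text does not say what happens between 5% and 10%, so the sketch assumes that this range also maps to `~` (which is what the worked example requires, since 0.314 is slightly under 10% of 3.144).

```python
def _sig3(value):
    """Round to three significant figures and drop trailing zeros."""
    return format(float(f"{value:.3g}"), "g")


def border_form(kernel, spread):
    """Written form of one finite, non-zero border."""
    ratio = spread / kernel
    if ratio >= 0.20:
        return "~~" + _sig3(kernel)
    if ratio >= 0.05:  # assumption: the unspecified 5-10% range uses "~"
        return "~" + _sig3(kernel)
    return _sig3(kernel)  # under 5%: ordinary decimal form


def print_form(fuzzy):
    """Written form of a 4-tuple (kernel_left, kernel_right,
    spread_left, spread_right) with finite, non-zero borders."""
    kernel_left, kernel_right, spread_left, spread_right = fuzzy
    left = border_form(kernel_left, spread_left)
    right = border_form(kernel_right, spread_right)
    # Identical border forms collapse to a single value.
    return left if left == right else f"{left}–{right}"
```

Under these assumptions `print_form((3.144, 3.141, 0.314, 0.35))` gives `~3.14`, and `print_form((200.0, 400.0, 20.0, 40.0))` gives `~200–~400`.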

3.3. Prototype pictures

We use pictures of dish portions to describe meal composition. An example of such pictures is given in Fig. 10.

All of these pictures were taken by our group of dieticians; so far, a database of 640 pictures has been compiled for the most common foods and dishes. We are continuing to take more pictures for less common foods.

A first problem to solve with the medical team dealt with standardizing the dish’s appearance. It was decided that they would all be presented on the same ordinary white plate, with a fork and a knife on either side, in order to give a scale for the food’s and dish’s sizes. Of course, it could be objected that the fork and the knife should also be put into perspective since several different plates, knives and forks of different absolute sizes could give the same image. But in fact, the food on the plate will serve as a size reference for the plate, while the plate serves as a reference for the food. This circularity is only apparent since these mutual definitions will really operate only after the presentation of several different images. For example, pictures showing foods with very constrained sizes (biscuits, yoghurts, etc.) will quickly disambiguate the sizes of the plate, knife and fork for the user.

Figure 10 Examples of pictures of dish portions.

A second point to take care of was the number of different portions to photograph for each dish. Naturally, most users use three different adjectives, ‘small’, ‘medium’ and ‘large’, when talking about the sizes of dish portions. This was confirmed in our study of the terms spontaneously used by children in food surveys. Sometimes these terms are modulated, as in ‘very small’ or ‘fairly large’, but there still remain three main categories. This partitioning created by language is self-maintaining: when we are faced with a portion picture, the fact that we classified portions in the past as ‘small’, ‘medium’ or ‘large’ distorts our perception so that it is assimilated into one of these three categories, even if it requires some qualification to do it (with ‘very’, ‘almost’, etc.). Therefore language helped to form perception, and perception created language. For example, when faced with a large plate of sauerkraut, we see in addition all the large plates of sauerkraut we were served in our childhood; our current perception is therefore distorted by these memories, and the perceived quantity will be an average, or rather the most typical quantity, of all similar portions from our memories.

3.4. Associated possibility distributions

Therefore the three pictures ‘small’, ‘medium’ and ‘large’ are prototypes; they correspond to what most people think of as small, medium or large portions for themselves. Next we had to decide what possibility distributions were associated with their weights.

Since we have seen in the previous section that each of these prototype pictures represented the largest class of all analogous portions that most people assimilated to it, the nutritionists had to assign large enough fuzzy weights (using the written language described previously) to them. We represented the values associated with the three pictures in Fig. 11 as examples.

Figure 11 Weight values assigned to pictures. The possibility distributions overlap, because each picture represents the largest class of all portions to which it is assimilated.

3.5. Towards a fuzzy food values database

For the last 5 years the dieticians of the Nutri-Educ project have been using the description language presented in Section 3.2 to build a French food values database with fuzzy data. At the same time they are building a picture database with fuzzy quantities as described in Sections 3.3 and 3.4. When this huge work is completed, it will be interesting to study how such a database improves the application software’s performance and the user experience.

4. A heuristic search algorithm for balancing meals

4.1. Description of the problem

Nutri-Educ helps the user to build a meal which is adapted to their needs. By ‘adapted to their needs’ (‘well-balanced’ for short) we mean that it conforms to international guidelines for the user, as they have been defined previously. We have also seen how to assess this global compatibility of the meal to these norms as a single necessity degree, which takes into account the individual compatibility degrees for each nutrient, and the relative importance of these individual degrees in the global assessment.

Thus, Nutri-Educ evaluates meals which belong to the state space of all possible meals which can be composed with a given set of foods or dishes. This set of components is finite: it is the set of all the foods and dishes the user chose to use, plus a small number of missing foods that the program may add to it in order to make it well-balanced. In addition, each of the foods or dishes used cannot be present in a meal in a continuous infinite quantity: they are served as one plate or half a plate in in-company or in-school cafeterias, and they are used as usual portions (spoonfuls, etc.) for foods and dishes served at home.

In fact, what links a meal to similar meals in this space is an action of addition, deletion or portion modification. These actions are both mathematical transformations on a quantity vector and nutritionally acceptable operations on a meal. Thus, this state space can now be seen as a graph, nodes being possible meals and edges between nodes being nutritionally acceptable actions of transformation from one meal to another. In the case of fixing a meal in Nutri-Educ, the goal is to find on this graph a node which corresponds to a well-balanced meal, and which is ‘as close to the initial meal as possible’ (to be defined) in the graph. The path in the graph from the initial meal to the balanced meal gives the sequence of actions to apply in order to fix the initial meal and make it well balanced. An example of such a graph is given in Fig. 12.
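As a rough illustration of this graph view, the sketch below represents a meal as a tuple of portion counts (one entry per food) and enumerates its neighbours. The cap of three portions per food, the action labels and the function names are assumptions made for the example only; the real application works with fuzzy quantities and nutritionally filtered actions.

```python
from collections import deque

MAX_PORTIONS = 3  # hypothetical upper bound on portions of one food


def successors(meal):
    """Yield ((action, food_index), new_meal) for every single-step change.
    A meal is a tuple of portion counts, one entry per available food."""
    for i, count in enumerate(meal):
        if count < MAX_PORTIONS:
            action = "add food" if count == 0 else "increase portion"
            yield (action, i), meal[:i] + (count + 1,) + meal[i + 1:]
        if count > 0:
            action = "remove food" if count == 1 else "decrease portion"
            yield (action, i), meal[:i] + (count - 1,) + meal[i + 1:]


def reachable(initial):
    """Breadth-first enumeration of every meal reachable from `initial`."""
    seen = {initial}
    queue = deque([initial])
    while queue:
        meal = queue.popleft()
        for _, neighbour in successors(meal):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen
```

With three foods and at most three portions each, `reachable((1, 0, 0))` contains 4 × 4 × 4 = 64 meals; the node count grows quickly with the number of foods, which is why an exhaustive enumeration is not a practical way to look for a balanced meal.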

In summary, the search graph is finite, made up of meals composed of a bounded number of portions of foods or dishes. It is usually quite large, with thousands of nodes for a dozen food components. The goal is to find on this graph a well-balanced meal for the user, starting from the initial meal provided by the user. Efficient techniques must be used, since our algorithms will only have a few seconds to find the goal meal in this search space.

Figure 12 A small part of the graph of possible meals. Foods are used by portion, and not on a continuous scale.

4.2. General principles of heuristic searches

This algorithm does not take into account the cost of an action of a change of state, which measures to which degree this change is difficult to perform, with higher costs for adding or replacing foods (since it leads to changing the recipe, doing some shopping, etc.) than for changing quantities. This minimum cost of transformation from the initial meal r to a meal m is usually denoted g.

Now we need to use a function to estimate the remaining cost from a current state m to the nearest goal state. This term is heuristic, since if it were certain and exact, it would mean that the search results would already be known! In reality, it is an estimation that will help choose the nodes to develop. This term is usually denoted h, and is called the heuristic term. Fig. 13 illustrates the utility of these two terms g and h.

Figure 13 The g is the actual cost from r to m, and h is the estimated cost from m to the nearest goal state. The costs on a path are cumulated additively.

With these two tools, we can even attempt a more ambitious goal. We can now look for, not only a goal state (a balanced meal), but also a goal state for which the cost of the path from r to it is minimal. In other words, it corresponds to finding the least expensive modifications that will make the meal well-balanced. The basic algorithm for an ordered heuristic search is as follows:

a- ‘‘generated meals’’ ← {initial meal}, ‘‘visited meals’’ ← ∅, g(r) ← 0
repeat
    b- focal ← meals x of ‘‘generated meals’’ which have the smallest g(x) + h(x)
    c- when one of the meals of focal is well balanced, exit with success
    d- choose in focal a meal m to develop
    e- for each successor n of m, compute g(n) = min(g(n), g(m) + cost(m, n)) and put n in ‘‘generated meals’’ with no duplicates
    f- add m to ‘‘visited meals’’ with no duplicates, remove m from ‘‘generated meals’’
until ‘‘generated meals’’ = ∅ (exit with failure)

This algorithm is called the A algorithm [18]. g(m) is an (over)estimation of the smallest cost of the paths from r to m, and h(m) is an estimation of the cost of the remaining path from m to a goal state. f(m) = g(m) + h(m) is then an estimation of the cost of the shortest path from r to a goal state which goes through m; by choosing at step d a state which has the smallest f(m), we bet on a state that we estimate to lie on a minimal path from r to a goal state.
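The steps a to f can be sketched on a toy problem. Everything specific here is an assumption made for illustration: the three foods, their energy values, the crisp energy target used as a stand-in for ‘well balanced’, the action costs (3 for adding a new food, 1 for a portion change) and the heuristic. The functions g and h actually used in Nutri-Educ operate on fuzzy values and are described in [20].

```python
import math

ENERGY = (100, 150, 50)  # hypothetical kcal per portion of three foods
TARGET = (390, 410)      # hypothetical crisp energy range for the meal
MAX_PORTIONS = 3


def energy(meal):
    return sum(e * n for e, n in zip(ENERGY, meal))


def is_balanced(meal):
    return TARGET[0] <= energy(meal) <= TARGET[1]


def successors(meal):
    """Yield (new_meal, cost): adding a new food costs 3, a portion change 1."""
    for i, n in enumerate(meal):
        if n < MAX_PORTIONS:
            yield meal[:i] + (n + 1,) + meal[i + 1:], (3 if n == 0 else 1)
        if n > 0:
            yield meal[:i] + (n - 1,) + meal[i + 1:], 1


def h(meal):
    """Lower bound on the remaining cost: every action costs at least 1
    and changes the energy by at most max(ENERGY)."""
    e = energy(meal)
    gap = max(TARGET[0] - e, e - TARGET[1], 0)
    return math.ceil(gap / max(ENERGY))


def ordered_search(r, max_cycles=20000):
    g, parent = {r: 0}, {r: None}
    generated, visited = {r}, set()                            # step a
    for _ in range(max_cycles):
        if not generated:
            return None                                        # exit with failure
        best = min(g[x] + h(x) for x in generated)
        focal = [x for x in generated if g[x] + h(x) == best]  # step b
        goal = next((x for x in focal if is_balanced(x)), None)
        if goal is not None:                                   # step c
            path, node = [], goal
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1], g[goal]
        m = focal[0]                                           # step d
        for n, cost in successors(m):                          # step e
            if g[m] + cost < g.get(n, math.inf):
                g[n], parent[n] = g[m] + cost, m
                visited.discard(n)  # a cheaper path reopens the meal
                generated.add(n)
        generated.discard(m)                                   # step f
        visited.add(m)
    return None  # cycle limit reached
```

Starting from one portion of the first food, `ordered_search((1, 0, 0))` returns the path `(1, 0, 0)`, `(1, 1, 0)`, `(1, 2, 0)` with a total cost of 4: the second food is added (cost 3) and its portion is then increased (cost 1).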

The exit on failure only happens when all states of this finite graph have been explored, and none is a goal state. In practice we will never wait as long as that and we will put a limit on the number of cycles the algorithm runs. In the more general case where the graph is infinite, it can be shown [18] that such an A algorithm:

• does terminate, and
• always finds a goal state when one exists.

However, the optimality of this goal state is not guaranteed. If h is a lower-bound estimate of the remaining cost to the nearest solution (that is, it never overestimates it), the algorithm is called A*, and it can be shown [19] that the first solution found has an optimal cost. Moreover, if we keep running the algorithm after having found the first solution, it will keep producing solutions in increasing order of cost. This feature is interesting in practice in a program such as Nutri-Educ, which will provide the user with a whole set of solutions, presenting the best ones first.
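The ‘keep running’ behaviour can be sketched with a generator over a small explicit graph. The graph, the goal set and the heuristic table below are invented for the example, and the heuristic is chosen to be consistent (a slightly stronger property than being a lower bound), so that a state never needs to be expanded twice.

```python
import heapq

# Hypothetical weighted graph: state -> [(successor, action cost)].
GRAPH = {
    "r": [("m1", 1), ("m2", 4)],
    "m1": [("balanced_b", 5), ("m2", 1)],
    "m2": [("balanced_a", 1)],
    "balanced_a": [],
    "balanced_b": [],
}
GOALS = {"balanced_a", "balanced_b"}
# Hypothetical heuristic: never overestimates the cost to the nearest goal.
H = {"r": 3, "m1": 2, "m2": 1, "balanced_a": 0, "balanced_b": 0}


def solutions_in_cost_order(start):
    """Yield (goal, cost) pairs, cheapest first, until the graph is exhausted."""
    heap = [(H[start], 0, start)]  # entries are (f, g, state)
    best_g = {start: 0}
    done = set()
    while heap:
        _, g, state = heapq.heappop(heap)
        if state in done:
            continue  # stale entry: a cheaper path was already expanded
        done.add(state)
        if state in GOALS:
            yield state, g
        for nxt, cost in GRAPH[state]:
            new_g = g + cost
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                heapq.heappush(heap, (new_g + H[nxt], new_g, nxt))
```

Here `list(solutions_in_cost_order("r"))` yields `balanced_a` with cost 3 first and `balanced_b` with cost 6 second, so a caller can stop after the first few solutions and still present the cheapest ones.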

The exact description of the functions g and h we used for Nutri-Educ and how fuzzy computations were dealt with in the algorithm is very technical and will not be described here; it can be found in [20]. We used a test database of 3479 real meals in order to assess the quality of various solutions. Our best algorithm, which is used in the current version of Nutri-Educ, is able to balance 3059 of the database meals in less than 20,000 cycles.

5. The Nutri-Educ application software

Nutri-Educ is a free software application which is presently available only in French and accessible at http://nutrieduc.fr (accessed: 29 November 2007); it is not open-source. The Diabetology Department of Rangueil University Hospital in Toulouse prescribes it to its diabetic patients, as part of their treatment and educational training. The software is composed of two parts: a client interface written in Flash/ActionScript, and a computation server written in Java. These two parts communicate using the XML-RPC protocol. This organization allows the program to run as a desktop application or as an Internet application. The client interface was designed with large widgets in order to also be used on kiosks with touch screens. Fig. 14 presents a block diagram of the various phases of the execution of Nutri-Educ.
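To give an idea of what an XML-RPC exchange between such a client and server looks like, the sketch below builds and decodes a request with Python's standard library (the actual client and server are written in ActionScript and Java). The method name `nutrieduc.assessMeal` and the payload fields are purely hypothetical; the real remote interface of Nutri-Educ is not documented here.

```python
import xmlrpc.client

# Hypothetical meal description, as a client might send it to the server.
meal = {
    "meal": "lunch",
    "items": [
        {"food": "bread", "portion": "medium"},
        {"food": "cheese", "portion": "small"},
    ],
}

# Serialize the call into the XML document carried by the HTTP request.
request_xml = xmlrpc.client.dumps((meal,), methodname="nutrieduc.assessMeal")

# What the receiving side would recover from that document.
params, method = xmlrpc.client.loads(request_xml)
```

Because the payload is plain XML over HTTP, the same server can be reached from a desktop client or from a browser-hosted one, which is the flexibility the organization described above relies on.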

The ‘identification’ and ‘user file editing’ phases provide Nutri-Educ with the necessary data about the user, such as their energy needs and information on possible medical problems related to nutrition. The ‘daily meal breakdown’ phase allows the program to compute an energetic goal for a particular meal (breakfast, snack, lunch or dinner). All other modules are related to the meal description and its balancing assessment and improvement.

Figure 14 Block diagram of Nutri-Educ architecture.

The previous version of this application was called ‘Nutri-Expert’ and used the same A* algorithm to improve meal balance. ‘Nutri-Educ’ is a more sophisticated version where the interface has been completely redesigned to take into account fuzziness in a visual way, from food portions when querying quantities to meal corrections which are actually drawn on the initial meal description, as can be seen in Fig. 17. Ref. [20] provides a very detailed account of the A* algorithm whereas the present paper gives a more general presentation of the application and puts more emphasis on data elicitation and related medical issues.

5.1. Meal description

Fig. 15 shows how a user chooses between several food pictures. The weight of each one is represented by a possibility distribution which was given by a dietician.

Figure 15 Choice between several portions of bread. Each one is associated with a weight possibility distribution.

5.2. Fuzzy computation and assessment

Fig. 16 shows a meal which has been provided by the user, and how it corresponds to the norms regarding four nutrients (energy, carbohydrates, fat, proteins). The score (7/10) is the combination of the compatibility measures associated with each nutrient, weighted by their importance.

Figure 16 Fuzzy computation and norm assessment with gauges.

5.3. Meal balancing

The user may then press the ‘balance’ (‘correction’) button, which runs the meal balancing A* algorithm. The recommended solutions are presented both in text (right column) and visually, using food pictures as much as possible (Fig. 17).

Figure 17 Meal balancing.

6. Conclusion

Fuzzy arithmetic is well suited to deal with the inherent imprecision of data associated with food weights and nutrient values, and to propagate it through computations in a mathematically sound way. Fuzzy pattern matching allows the compatibility of data to a norm to be assessed and all users can understand the results with our extended gauges. Finally, a heuristic search algorithm permits the user to transform a meal to make it well balanced, a complex qualitative and quantitative problem which is successfully solved in 87% of cases.

For the future, we intend to deal with all the meals of a day and try to balance them globally instead of balancing them individually, which would make better sense from a nutritional point of view (at least for users who are not diabetics). It would greatly enlarge the state space of the heuristic search algorithm, but would also reduce the constraints on the problem and facilitate its solution. We also need to carry out medical evaluation studies to assess to what extent using Nutri-Educ improves the user’s knowledge and modifies their daily cooking habits. If proved, its efficacy would make Nutri-Educ an easy-to-use and powerful tool for helping to fight the worldwide obesity problem.

References

[1] Buisson J-C. Approximate reasoning in computer-aided medical decision systems. In: Zimmermann H-J, editor. The handbooks of fuzzy sets practical applications of fuzzy technologies, 7. Boston: Kluwer; 1999. p. 337–61.

[2] Wirsam B, Hahn A. Fuzzy methods in nutrition planning and education in clinical nutrition. In: Theodorescu, Jain, Kandel, editors. Soft computing in human-related sciences. Boca Raton: CRC Press; 1999. p. 335–50.

[3] Zadeh LA. Fuzzy sets. Information and Control 1965;8:338–53.

[4] Moore RE. Interval analysis. New York: Prentice Hall; 1966.

[5] Dubois D, Prade H. Théorie des Possibilités – Application à la représentation de connaissances en informatique. Paris: Masson; 1988.

[6] Terano T, Asai K, Sugeno M, editors. M.S. applied fuzzy systems. Cambridge: AP Professional; 1995.

[7] Dubois D, Prade H. Fuzzy numbers: an overview. In: Bezdek J, editor. Analysis of fuzzy information. Boca Raton: CRC Press; 1987. p. 3–39.

[8] Buisson J-C. Knowledge development expert systems and their application in nutrition. In: Leondes C, editor. Knowledge based systems – techniques and applications. New York: Academic Press; 2000. p. 37–65.

[9] Turnin M-C, Beddok R, Clottes J, Abadie R, Martini P, Buisson J-C, et al. Telematic expert system diabeto: a new tool for diet self monitoring for diabetic patients. Diabetes Care 1992;15(2):204–12.

[10] Turnin MC, Bourgeois O, Cathelineau G, Leguerrier A, Halimi S, Sandre-Banon D, et al. Multicenter randomized evaluation of a nutritional educational software in obese patients. Diabetes Metab 2001;27:139–47.

[11] USDA-SR16. National nutrient database for standard reference, release 16. Technical report. Washington: United States Department of Agriculture USDA Nutrient Data Laboratory Home Page. 2003, http://www.ars.usda.gov/ba/bhnrc/ndl (accessed: December 1, 2007).

[12] Pehrsson PR, Haytowitz DB, Holden JM, Perry CR, Beckler DG. USDA’s national food and nutrient analysis program: food sampling. Journal of Food Composition and Analysis 2000;13:379–89.

[13] USDA. United States Department of Agriculture, dietary guidelines for Americans. Washington: USDA, Nutrient Data Laboratory Home Page. 2005, http://www.ars.usda.gov/ba/bhnrc/ndl (accessed: December 1, 2007).

[14] Krauss RM, Eckel R, Howard B. American heart association dietary guidelines. Circulation 2000;102:2284–99.

[15] Schneeman BO. Evolution of dietary guidelines. Journal of the American Dietetic Association 2003;103(12):5–9.

[16] Schneeman BO. Trapezoidal approximations of fuzzy numbers – revisited. Fuzzy Sets and Systems 2007;158:757–68.

[17] Cayrol M, Farreny H, Prade H. Fuzzy pattern matching. Kybernetes 1982;11:103–16.

[18] Farreny H. Recherche heuristiquement ordonnée dans les graphes d’états. Paris: Masson; 1995.

[19] Hart PE, Nilsson NJ, Raphael B. A formal basis for the heuristic determination of minimum cost paths. IEEE Transactions on Systems Science and Cybernetics 1968;4:100–7.

[20] Buisson J-C, Garel A. Balancing meals using fuzzy arithmetic and heuristic search algorithms. IEEE Transactions on Fuzzy Systems 2003;11:68–78.