
COVER SHEET

This is the author version of the article published as: Lau, Raymond Y.K. and Tang, Maolin and Wong, On and Milliner, Stephen W. and Chen, Yi-Ping Phoebe (2006) An Evolutionary Learning Approach for Adaptive Negotiation Agents. International Journal of Intelligent Systems 21(1): pp. 41-72. Accessed from http://eprints.qut.edu.au. Copyright 2006 John Wiley & Sons.

Page 2: COVER SHEET - COnnecting REpositories · 2013-07-02 · An Evolutionary Learning Approach for Adaptive Negotiation Agents Raymond Y.K. Lau, Maolin Tang, On Wong and Stephen W. Milliner

An Evolutionary Learning Approach for Adaptive Negotiation Agents

Raymond Y.K. Lau, Maolin Tang, On Wong and Stephen W. Milliner
Centre for Information Technology Innovation
Faculty of Information Technology
Queensland University of Technology
GPO Box 2434, Brisbane, Qld 4001, Australia
E-mail: {r.lau, m.tang, o.wong, s.milliner}@qut.edu.au

Yi-Ping Phoebe Chen
School of Information Technology
Deakin University
221 Burwood Highway, Burwood, Victoria 3125, Australia
E-mail: [email protected]

Abstract. Developing effective and efficient negotiation mechanisms for real-world applications such as e-Business is challenging, since negotiations in such contexts are characterised by combinatorially complex negotiation spaces, tough deadlines, very limited information about the opponents, and volatile negotiator preferences. Accordingly, practical negotiation systems should be empowered by effective learning mechanisms to acquire dynamic domain knowledge from possibly changing negotiation contexts. This paper illustrates our adaptive negotiation agents, which are underpinned by robust evolutionary learning mechanisms to deal with complex and dynamic negotiation contexts. Our experimental results show that GA-based adaptive negotiation agents outperform a theoretically optimal negotiation mechanism which guarantees Pareto optimality. Our research work opens the door to the development of practical negotiation systems for real-world applications.

Keywords: Genetic Algorithms, Evolutionary Learning, Adaptive Negotiation Agents.

1. Introduction

Negotiation refers to the process by which a group of agents (human or software) communicate with one another in order to reach a mutually acceptable agreement on resource allocation (distribution) [30]. Indeed, negotiation is one of the key stages of the Business-to-Business Transaction (BBT) model [19]. If the negotiation spaces are large, as found in many Business-to-Business (B2B) trading situations where dozens of issues (i.e., attributes) are involved, even experienced human negotiators will be overwhelmed. Under such circumstances, sub-optimal rather than optimal deals are often reached. Consequently, the phenomenon of "leaving some money on the table" may occur [37]. Software agents are encapsulated computer systems situated in some environment, such as the Internet, and are capable of flexible, autonomous actions in that environment to meet their design objectives [21, 48]. The notion of agency can be applied to build robust architectures for automated negotiation systems within which a group of software agents communicate and autonomously make negotiation decisions on behalf of their human users. Recent research in intelligent agent mediated electronic commerce has highlighted the importance and the benefits of agent-based negotiation support for e-Business [17, 19]. These negotiation agents can considerably reduce human negotiation time and identify optimal or near optimal solutions from combinatorially complex negotiation spaces. This paper focuses on one of our negotiation models, a Genetic Algorithm (GA) based adaptive negotiation agent model, implemented on our Web-based negotiation server.

1.1. Requirements of practical negotiation systems

Practical negotiation mechanisms must be computationally efficient. This implies that negotiation agents should be developed based on the assumption of bounded rather than perfect rationality [30]. These agents have only limited time and resources to deliberate negotiation solutions. As many authors have indicated, agents' knowledge about the negotiation spaces is very limited in business environments [11, 21]. An agent knows its own preferences (utility function) but normally does not know the preferences of its opponents because this information is often kept private by individual businesses. Accordingly, negotiation agents should be designed to deal with the incomplete and uncertain information found in most negotiation scenarios. In particular, these agents should be empowered by effective and efficient learning mechanisms to gradually learn the opponents' interests based on their moves in the negotiation process. Moreover, given that the preferences of an agent and those of its opponents may change over time because of the ever changing negotiation environment, effective negotiation mechanisms must be able to deal with dynamic negotiation scenarios. For example, a negotiation agent should be able to observe the changing preferences of its opponents and adapt to the opponents' changing behaviour by initiating the corresponding actions (e.g., conceding more or less in subsequent rounds). To support the various negotiation scenarios of real-world applications, negotiation agents should be able to act flexibly and mimic a wide spectrum of negotiation attitudes (e.g., from fully self-interested to fully benevolent) rather than being limited to a few pre-defined negotiation attitudes. Finally, since real-world negotiations such as negotiations in e-Business often involve many issues (e.g., price, quantity, product quality, shipment time, payment method, etc.), practical negotiation mechanisms must be able to handle multi-issue negotiations.

1.2. Justifications of the proposed approach

Negotiations in the real world are often characterised by combinatorially complex negotiation spaces which involve many issues. In addition, negotiators are bounded by limited computational resources, time, and information about the negotiation spaces. Classical negotiation models based on operational research methods [9, 20] or traditional game-theoretic models [34, 47] need to be further developed to support negotiations in these realistic situations. GAs have long been used as heuristic search methods to find optimal or near optimal solutions in large search spaces [16]. GAs can identify feasible solutions (e.g., mutually acceptable offers) efficiently without requiring detailed structural information about the search space as input. In the context of automated negotiations, our proposed GA is used to drive the heuristic search over a set of potential negotiation solutions (i.e., agreements). Given a negotiation situation, each negotiation agent employs its own GA to conduct the distributed heuristic search in parallel. Therefore, our proposed GA-based negotiation method is more efficient than other negotiation methods based on exhaustive search. As a result, our negotiation approach is feasible for solving large and complex negotiation problems. In addition, the negotiation agents' searches for a mutually acceptable solution are not conducted in a totally independent manner. Instead, a co-evolution [7] is actually performed, since the intermediate search result (i.e., a tentative solution) obtained by an agent is passed to its negotiation opponents to stimulate their evolution processes. Thereby, a group of negotiation agents works collaboratively to solve complex negotiation problems.

It has been shown that, under certain conditions, a GA-based negotiation model can lead to the same optimal negotiation outcome as that generated by a game-theoretic model [13]. In addition, since GAs are based on the evolutionary principle of "natural selection", they are effective in modelling dynamic negotiation environments where good negotiation strategies evolve according to negotiators' changing preferences. As a matter of fact, GAs have been successfully applied to develop automated negotiation systems [6, 27, 29]. GA-based negotiation approaches fulfil the general requirements for developing practical negotiation systems in terms of computational efficiency, bounded agent rationality, and the assumption of limited information about the negotiation spaces. Therefore, a GA-based adaptive negotiation model is proposed to support negotiations for real-world applications.

1.3. Contributions of the paper

Although many negotiation models have been reported in the literature [18, 25, 35, 39, 50], these models have limited use in realistic negotiation environments because they assume complete knowledge about the negotiation spaces, are computationally inefficient, ignore time pressure, are ineffective in learning changing negotiation contexts, or deal with a single issue only. This paper illustrates our GA-based adaptive negotiation agents, which address all of the above issues. In particular, the three main contributions of the work reported in this paper are:

1. Reviewing existing learning approaches for automated negotiations, particularly GA-based adaptive negotiation mechanisms;

2. Designing and developing GA-based negotiation agents to support real-world applications such as negotiations for e-Business;

3. Quantitatively evaluating the performance of our GA-based adaptive negotiation agents.

1.4. Outline of the paper

The rest of the paper is organised as follows. Section 2 highlights the basic concepts of a generic negotiation framework. Section 3 illustrates a basic negotiation model which guarantees Pareto optimality. Our GA-based adaptive negotiation mechanism is then illustrated in Section 4. Section 5 reports our experimental procedures and our empirical results. Section 6 compares our method with other related negotiation approaches. The final section highlights possible extensions of the current work and concludes with a discussion of our GA-based adaptive negotiation agents.

2. A Generic Framework for Automated Negotiation

Given its ubiquity and importance in many different contexts, research into negotiation theories and techniques has attracted attention from multiple disciplines such as Distributed Artificial Intelligence (DAI) [24, 26, 46], Social Psychology [3, 36, 37], Operational Research [9, 20], and Game Theory [34, 47]. Despite the variety of approaches towards the study of negotiation theory, a negotiation model consists of four main elements: negotiation protocol, negotiation strategies, negotiation environment, and agent environment [14, 27, 30].

Negotiation Protocol refers to the set of rules that govern the interactions among negotiators. The rules specify the types of participants (e.g., the existence of a negotiation mediator), the valid states (e.g., waiting for bid submission, negotiation closed), and the actions that cause negotiation state changes (e.g., accepting an offer, quitting a negotiation session). For example, the English Auction is a well-known protocol for single-issue (e.g., price) negotiation in an ascending open-cry environment.

Negotiation Strategies refer to the decision-making apparatus that the agents employ to act in line with the negotiation protocol, the negotiation environment, and the agent environment to achieve their objectives. For instance, negotiators should set stringent goals initially and concede first on issues of lesser importance to achieve higher payoffs in a competitive environment [3, 37]. A negotiation mechanism refers to a particular negotiation protocol and the corresponding decision-making model for the formulation of negotiation strategies.

Negotiation Environment refers to the factors that are relevant to the problem domain. These factors include the number of negotiation issues, the number of parties, time constraints, the nature of the domain (e.g., purely competitive vs. cooperative), etc.

Agent Environment refers to the characteristics of the participants (agents). These characteristics include an agent's attitude (e.g., self-interested vs. benevolent), cognitive limitations (e.g., omniscient agent vs. memoryless agent), goal setting, initial offer magnitude, knowledge and experience (e.g., knowledge about the opponents), etc.

This paper focuses on our GA-based adaptive negotiation mechanism. In general, negotiators strive to increase their individual payoffs while ensuring that an agreement is feasible [27, 39]. In other words, negotiation agents have a common interest to cooperate, but have conflicting interests over exactly how to cooperate [14]. The main problem that confronts negotiation agents is to decide how much to concede in order to achieve the two conflicting goals of maximising their own payoffs and ensuring that an agreement is made as soon as possible. In short, the most prominent issues that must be addressed in a negotiation mechanism are:

− How to represent negotiators’ preferences and offers;

− How to compute concession and generate an offer;

− How to evaluate an incoming offer;

− How to learn the opponents’ preferences.

Figure 1 depicts a simple negotiation process which involves a buyer and a seller. In this paper, it is assumed that only a finite set of agents P participates in a negotiation process. The negotiation process can be understood with reference to the simple monotonic concession protocol [38]. Negotiation proceeds in a discrete series of rounds. In each round, the agents put forward offers in alternation. If these offers overlap, an agreement is reached. If the offers do not overlap, negotiation proceeds to the next round, where the agents may make a concession or put forward the same offers again. If there is no agreement when the deadline is reached, an agent decides to quit and the negotiation ends with a conflict. With reference to Figure 1, the buyer agent p_B makes an offer o_1 first. Such an offer is driven by her preferences (e.g., the utility function U^o_{p_B}). The seller agent p_S ∈ P evaluates o_1 according to her own preferences U^o_{p_S}. If the offer o_1 produces a payoff (utility) higher than or equal to the seller's expected payoff at this round, the seller will accept the offer; otherwise a counter-offer o_2 is proposed by the seller. Similarly, the buyer agent p_B ∈ P will evaluate o_2 according to her own preferences. If o_2 is not acceptable, the buyer agent may make a concession based on her original offer o_1 (e.g., raising the price) and generate a counter-offer o_3. This process continues until an offer and a counter-offer overlap, or either side decides to quit.

Figure 1. A Simple Negotiation Process (buyer p_B, with utility function U^o_{p_B} : O_{p_B} → R and hard constraints HC_{p_B}, exchanges offer o_1, counter-offer o_2, offer with concession o_3, ..., counter-offer o_n with seller p_S, with utility function U^o_{p_S} : O_{p_S} → R and hard constraints HC_{p_S}, until either side accepts or quits)

A negotiation space Neg = <P, A, D, U, T> is a 5-tuple which consists of a finite set of negotiation parties (agents) P, a set of attributes (i.e., negotiation issues) A understood by all the parties p ∈ P, a set of attribute domains D for A, a set of utility functions U with each function U^o_p ∈ U belonging to an agent p ∈ P, and a set of deadlines T. An attribute domain is denoted D_{a_i}, where D_{a_i} ∈ D and a_i ∈ A. A utility function pertaining to an agent p is defined by U^o_p : D_{a_1} × D_{a_2} × ... × D_{a_n} → R, where R is the set of real numbers. Each agent p has a deadline t^d_p ∈ T. It is assumed that information about P, A, D is exchanged among the negotiation parties during the ontology-sharing stage before negotiation actually takes place. A multi-lateral negotiation situation can be modelled as many one-to-one bilateral negotiations where an agent p maintains a separate negotiation dialogue with each opponent. In a negotiation round, the agent will make an offer to each of its opponents in turn, and consider the most favourable counter-offer from among the set of incoming offers according to its own payoff function U^o_p.


3. A Basic Negotiation Model

The basic negotiation model illustrated in this section is based on multi-attribute utility theory (MAUT) [23] and was first discussed in [2]. It is a variant of the monotonic concession protocol with the Zeuthen strategy [38, 51]. This simple model can guarantee Pareto optimality if an agreement zone exists in a negotiation space. Therefore, we use it as a baseline model to evaluate the performance of our GA-based negotiation model.

3.1. Representing offers

An offer \vec{o} = <d_{a_1}, d_{a_2}, ..., d_{a_n}> is a tuple of attribute values (intervals) pertaining to a finite set of attributes A = {a_1, a_2, ..., a_n}. An offer can also be viewed as a vector of attribute values in a geometric negotiation space with each dimension representing a negotiation issue. Each attribute a_i takes its value from the corresponding domain D_{a_i}. Generally speaking, a finite set of candidate offers O_p acceptable to an agent p (i.e., satisfying its hard constraints) is constructed via the Cartesian product D_{a_1} × D_{a_2} × ... × D_{a_n}. As human agents tend to specify their preferences in terms of ranges of values, a more general representation of an offer is a tuple of attribute value intervals such as \vec{o}_i = <20-30(K), 1-2(years), 10-30(days), 100-500(units)>.

3.2. Representing negotiation preferences

The valuations of individual attributes and attribute values (intervals) are defined by the valuation functions U^A_p : A → [0, 1] and U^{D_a}_p : D_a → [0, 1] respectively, where U^A_p is an agent p's valuation function for each attribute a ∈ A, and U^{D_a}_p is an agent p's valuation function for each attribute value d_a ∈ D_a. In addition, the valuations of attributes are assumed normalised, that is, \sum_{a \in A} U^A_p(a) = 1. One common way to quantify an agent's preference (i.e., the utility function U^o_p) for an offer o is by a linear aggregation of the valuations [2, 22, 40]:

U^o_p(o) = \sum_{a \in A} U^A_p(a) \times U^{D_a}_p(d_a)

A non-linear utility function may also be used to rank offers [6]. For example, if an agent p's valuations for the attributes are U^A_p(price) = 0.9 and U^A_p(quantity) = 0.1, and its valuations for the attribute intervals are U^{D_a}_p(20-30K) = 0.8 and U^{D_a}_p(100-200 units) = 0.5, then the offer \vec{o}_i = <20-30(K), 100-200(units)> has utility 0.77 (i.e., U^o_p(o_i) = 0.9 × 0.8 + 0.1 × 0.5 = 0.77) from agent p's perspective.
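As an illustration only (not part of the paper's implementation), the linear aggregation and the worked example above can be reproduced in a few lines of Python; the dictionary-based representation below is an assumption of ours:

def utility(offer, attr_weights, value_scores):
    # Linear aggregated utility: sum over attributes of
    # U_p^A(a) * U_p^Da(d_a), with the attribute weights summing to 1.
    return sum(attr_weights[a] * value_scores[(a, offer[a])] for a in offer)

# Worked example from the text: 0.9 * 0.8 + 0.1 * 0.5 = 0.77
weights = {"price": 0.9, "quantity": 0.1}
scores = {("price", "20-30K"): 0.8, ("quantity", "100-200units"): 0.5}
offer = {"price": "20-30K", "quantity": "100-200units"}
assert abs(utility(offer, weights, scores) - 0.77) < 1e-9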

3.3. Computing concessions and generating offers

If an agent’s initial proposal is rejected by its opponent, it needs to propose an alternativeoffer with the least utility decrement (i.e., computing a concession). An agent will main-tain a set O

′p which contains the offers it has proposed before. In a negotiation round, a

new offer with concession onew can be determined based on the total order ox �p onew,where ox �p onew denotes that the new offer onew is more preferable than an arbitraryoffer ox ∈ {Op−O

′p}. The term {Op−O

′p} represents the difference of the sets Op (the set

of feasible offers pertaining to an agent p) and O′p (the set of offers proposed by an agent

p before). The preference relation �p is a total ordering induced by the utility function

Page 8: COVER SHEET - COnnecting REpositories · 2013-07-02 · An Evolutionary Learning Approach for Adaptive Negotiation Agents Raymond Y.K. Lau, Maolin Tang, On Wong and Stephen W. Milliner

Uop described in Section 3.2 over a set of offers. In other words, the set of feasible offers

of an agent p are ranked in descending order of utility driven by (�p, {Op−O′p}). A new

offer with concession is picked up from the top of the ranking in each negotiation round.
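A minimal sketch of this ranking-based concession step, assuming the utility function of Section 3.2 is available as a Python callable (the function and variable names are ours, not the paper's):

def next_offer(feasible, proposed, utility_of):
    # Propose the most preferred feasible offer not proposed before,
    # i.e. the top of the descending utility ranking over {O_p - O'_p}.
    remaining = [o for o in feasible if o not in proposed]
    if not remaining:
        return None  # nothing left: the agent cannot concede further
    best = max(remaining, key=utility_of)
    proposed.append(best)  # remember it in O'_p
    return best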

3.4. Evaluating incoming offers

When an incoming offer o is received from an opponent, an agent p first evaluates whether o ∈ O_p holds (i.e., whether the offer satisfies all its hard constraints). To do this, an equivalent offer o' is computed. o' represents agent p's interpretation of the opponent's proposal o. Once o' is computed for o, acceptance of the incoming offer o can be determined with respect to p's own preference (⪯_p, O_p). An offer o' ∈ O_p is equivalent to o iff every attribute interval of o' intersects the corresponding attribute interval of o. Formally, any two attribute intervals d_x, d_y intersect if the intersection of the corresponding sets of points is not empty (i.e., {d_x} ∩ {d_y} ≠ ∅). The acceptance criteria for an incoming offer o (i.e., its equivalent o') are defined by:

1. If ∀ o_x ∈ O_p : o_x ⪯_p o', an agent p should accept o since it produces the maximal payoff.

2. If o' ∈ O'_p is true, an agent p should accept o because o' is one of its previously proposed offers in O'_p.
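A sketch of the equivalence test and the two acceptance criteria, assuming offers are held as tuples of (low, high) attribute intervals (illustrative names; the paper does not prescribe a data structure):

def intersects(dx, dy):
    # Two attribute intervals intersect iff their point sets overlap.
    return max(dx[0], dy[0]) <= min(dx[1], dy[1])

def equivalent(own, incoming):
    # o' is equivalent to o iff every attribute interval of o' intersects
    # the corresponding attribute interval of o.
    return all(intersects(a, b) for a, b in zip(own, incoming))

def should_accept(incoming, feasible, proposed, utility_of):
    # Criterion 1: some equivalent o' maximises p's payoff over O_p.
    # Criterion 2: some equivalent o' was previously proposed by p.
    equivalents = [o for o in feasible if equivalent(o, incoming)]
    if not equivalents:
        return False  # the incoming offer violates p's hard constraints
    best = max(utility_of(o) for o in feasible)
    return any(utility_of(o) >= best or o in proposed for o in equivalents)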

It has been proved that if each participating agent p ∈ P employs its preference ordering (⪯_p, O_p) to compute concessions and uses the offer acceptance criteria described above to evaluate incoming offers, a Pareto optimal solution is always found if one exists in the negotiation space [2]. The advantage of this basic negotiation model is that it requires neither a pre-defined negotiation threshold nor information about the opponents' utility functions. However, one of the problems of the basic negotiation model is that it may take a long time to sequentially evaluate all the candidate offers before a Pareto optimal solution is found. Given the high-dimensional multi-issue negotiation spaces normally found in real-world negotiation situations, this kind of blind sequential search may not be feasible. Another problem of the basic model is that it assumes the preferences of the agents remain unchanged during a negotiation process, so that a Pareto optimal solution can be identified a priori. Nevertheless, agents' preferences (i.e., their utility functions) often change over time in real-world business environments. Therefore, the Pareto optimal solution identified by the model may not even be an acceptable solution because it is derived from inaccurate and out-dated information.

4. GA-Based Adaptive Negotiation Agents

The development of our GA-based adaptive negotiation agents is driven by the basic intuition that negotiators tend to maximise their individual payoffs while ensuring that an agreement is reached [11, 27, 39]. This intuition is taken as the basis for developing the high-level negotiation strategy of our GA-based adaptive negotiation agents. The advantage of such an approach is that it does not rely on detailed assumptions about a particular kind of negotiation environment, and so it is general enough to deal with a variety of negotiation situations. On the other hand, the low-level negotiation strategy (also called tactics [31, 32]) is developed according to our proposed genetic algorithm. The low-level negotiation strategy of an agent determines which offer is to be made in each negotiation round, and such a strategy is adaptive with respect to the opponents' possibly changing negotiation behaviour. To make an agent's low-level negotiation strategy adaptive, an effective learning mechanism is required which can take into account the agent's own payoff as well as the opponents' payoffs (to increase the chance of reaching an agreement). The learning mechanisms of our negotiation agents are underpinned by a genetic algorithm. The proposed GA ensures that negotiation agents conduct their heuristic searches (i.e., the negotiation processes) in an effective and efficient manner.

Figure 2. A Geometric Negotiation Space (feasible offers o_a, ..., o_e plotted against the issues x (price) and y (quantity), together with their distances to the maximal-payoff offer o_max and to the opponent's counter-offer o_opponent)

To estimate the opponents' payoffs, an agent needs to learn the preferences of its opponents through their previous encounters. There are two possible approaches reported in the literature: one is to learn the opponent's utility function directly [6, 50], and the other is to estimate an opponent's preferences indirectly according to its previous counter-offers [11, 52]. We adopt the latter approach since it is simpler and computationally tractable. If the first approach is adopted, it is quite difficult, if not totally impossible, to assign prior probabilities to the set of possible outcomes given that we are dealing with practical negotiation problems involving many negotiation issues [11]. On the other hand, when a concession is computed, an offer can be considered as a whole or the individual issues of an offer can be considered separately [14]. Our proposed concession generation mechanism considers an offer as a whole since we believe that this is more efficient than conceding on an issue-by-issue basis.

Essentially, each agent is empowered by the proposed GA to evaluate the set of feasible offers pertaining to its negotiation space. The fittest member according to the fitness function applied in the GA is chosen from a population (i.e., a subset of the set of feasible offers O_p) to form a negotiation solution in a negotiation round. According to the basic intuition, an offer is considered fit if it tends to generate maximal payoff (i.e., it is close to the maximal offer) and it is also similar to the opponent's recent counter-offer. The evaluation of offers can be analysed with respect to a geometric negotiation space (Figure 2) pertaining to an agent. In Figure 2, the offers o_a, ..., o_e are the subset of feasible offers under consideration. Since an agent knows its own utility function, an offer o_max representing the offer with the maximal payoff can be identified. In addition, an offer o_opponent represents the opponent's most recent counter-offer. All these offers are represented by the corresponding offer vectors in the geometric negotiation space. The distance between the vectors can then be measured by a standard distance function such as the weighted Euclidean distance [8]. In general, the fitter offers of a population are those having minimal distances to both \vec{o}_max and \vec{o}_opponent. In each negotiation round, the offer vector \vec{o}_opponent may change, and so do the offers considered fit by the agent. In other words, an agent learns the opponent's preferences gradually. According to the evolutionary principle, the population of feasible offers is gradually dominated by fit members. Therefore, an agent's proposal moves closer to its opponent's in each negotiation round. Finally, the conflict between an agent and its opponent is resolved and an agreement is reached. Another advantage of the GA-based negotiation model is that it does not assume that the preferences of the agents remain unchanged. In fact, the preferences of the agents may change in each negotiation round, and these changes are reflected by the new offer vectors \vec{o}_max and \vec{o}_opponent. The set of feasible offers is then evaluated with respect to these new reference vectors (i.e., the agents' new preferences).

4.1. Offer encoding

The proposed GA-based negotiation model utilises a population of chromosomes to represent a set of feasible offers O^{feas}_p ⊆ O_p for an agent p. Each chromosome consists of a fixed number of fields. The first field uniquely identifies a chromosome, and the second field holds the fitness value of the chromosome. The other fields (genes) represent the attribute values of a candidate offer. Figure 3 depicts the decimal encoding (genotype) of some chromosomes and a crossover operation. The genetic operators such as crossover and mutation are only applied to the genes representing the attribute values of an offer.

4.2. Fitness function

The top t chromosomes (candidate offers) with the highest fitness are selected from a population to build the solution set S. If the size of S is 1, the fittest chromosome of the population is chosen as the offer. In general, a stochastic selection function is applied to the solution set to choose a member as the solution (i.e., the current offer) in a particular negotiation round. Ideally, a fitness function should reflect the joint payoff of each candidate offer. Unfortunately, the utility functions of the opponents are normally not available in e-Business applications. Therefore, the proposed fitness function approximates the ideal function and captures two important aspects: an agent's own payoff and the opponent's partial preferences (e.g., its most recent counter-offer). With reference to the geometric negotiation space depicted in Figure 2, the fitness of a chromosome (i.e., an offer o) is defined by:

fitness(o) = \alpha \times \frac{U^o_p(o)}{U^o_p(o_{max})} + (1 - \alpha) \times \left(1 - \frac{dist(\vec{o}, \vec{o}_{opponent})}{MaxDist(|A|)}\right)    (1)

where o_max represents an offer which produces the maximal payoff based on an agent's current utility function, and \vec{o}_opponent is the offer vector representing the most recent counter-offer proposed by an opponent. The parameter α ∈ [0, 1] is the trade-off factor that controls the relative importance of optimising one's own payoff versus reaching a deal (e.g., by considering the opponent's recent offer). In other words, α is used to model a wide spectrum of agent attitudes, from fully self-interested (α = 1) to fully benevolent (α = 0). The term MaxDist(|A|) represents the maximal distance of a geometric negotiation space. It can be derived from the number of dimensions |A| if each dimension (attribute) is normalised to the unit interval. In the very first negotiation round, the agents use MaxDist(|A|) in place of the actual distance dist(\vec{o}, \vec{o}_{opponent}) in Eq. (1), since o_opponent is unknown at this stage. This is a conservative estimate which assumes that the agents' interests are in total conflict at the beginning. In other words, an offer is initially evaluated based on an agent's own utility function alone. In Eq. (1), we simply use the ratio of the utility generated by a potential offer o to the maximal utility, instead of computing their Mahalanobis distance, because the ratio is easier to compute given the utility function of an agent. Nevertheless, it is impossible to compute such a ratio for the opponent because the utility function of the opponent is unknown.

Figure 3. Encoding Candidate Offers (each chromosome carries an offer ID field, a fitness field, and genes holding the attribute value intervals of a candidate offer, e.g., price, quantity, and shipment; a two-point crossover between points A and B on parents o1 and o2 produces the child offer o3)
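The following Python fragment sketches Eq. (1), using the weighted Euclidean distance of Eq. (2) below and treating offers as vectors already scaled to the unit interval (a sketch under our own naming, not the negotiation server's code):

import math

def weighted_distance(ox, oy, weights):
    # Eq. (2): weighted Euclidean distance between two offer vectors
    # whose components are already scaled to [0, 1].
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, ox, oy)))

def fitness(o, o_max, o_opponent, alpha, weights, utility_of):
    # Eq. (1): trade off the agent's own payoff ratio against proximity
    # to the opponent's most recent counter-offer.
    max_dist = math.sqrt(len(weights))  # MaxDist(|A|) for unit-scaled axes
    if o_opponent is None:
        dist = max_dist  # first round: conservatively assume total conflict
    else:
        dist = weighted_distance(o, o_opponent, weights)
    own = utility_of(o) / utility_of(o_max)
    return alpha * own + (1 - alpha) * (1 - dist / max_dist)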

The distance between two offer vectors dist(\vec{o}_x, \vec{o}_y) is defined according to the weighted Euclidean distance [8]:

dist(\vec{o}_x, \vec{o}_y) = \sqrt{\sum_{i=1}^{|A|} w_i (d^x_i - d^y_i)^2}    (2)

where the weight factor w_i = U^A_p(a_i) is an agent's valuation for a particular attribute a_i ∈ A, and an offer vector \vec{o}_x contains an attribute value d^x_i along the i-th dimension (issue) of the negotiation space. If an attribute interval instead of a single value is specified for an offer, the mid-point of the attribute interval is first computed. The mid-point value is then scaled to the unit interval [0, 1] by linear scaling:


d^{scaled}_i = \frac{d_i - d^{min}_i}{d^{max}_i - d^{min}_i}    (3)

where the scaled attribute value d^{scaled}_i takes on values from the unit interval [0, 1], and d^{min}_i and d^{max}_i represent the minimal and the maximal values of a domain D_{a_i}.

For practical negotiations arising in business contexts, time pressure is often an important factor in concession generation. When the negotiation deadline is close, an agent is more likely to concede in order to make a deal. However, different agents may have different attitudes towards deadlines. An agent may be eager to reach a deal and so concede quickly (a Conceder agent). On the other hand, an agent may not give ground easily during negotiation (a Boulware agent) [37]. Therefore, a time pressure function TP is developed to approximate a wide spectrum of agent attitudes towards time. Our TP function is similar to the negotiation decision functions referred to in the literature [10, 13, 14].

TP(t) = 1 - \left(\frac{\min(t, t^d_p)}{t^d_p}\right)^{\frac{1}{e_p}}    (4)

TP(t) denotes the time pressure at time t, where t is represented by absolute time or the number of negotiation rounds; t^d_p indicates the deadline for an agent p and is expressed either as absolute time or as the maximum number of rounds allowed. The term e_p denotes the agent p's eagerness in negotiation [41, 42, 43, 45]. An agent p is Boulware if 0 < e_p < 1 is set; for a Conceder agent, e_p > 1 holds. If e_p = 1, the agent holds a Linear attitude towards the deadline. The values of the TP function lie within the unit interval [0, 1]. When t = 0, the function returns 1. When the deadline is due (t = t^d_p), the time pressure function returns 0. The time pressure factor is incorporated into our GA-based negotiation model by the enhanced fitness function:

fitness(o) = \alpha \times TP(t) \times \frac{U^o_p(o)}{U^o_p(o_{max})} + (1 - \alpha \times TP(t)) \times \left(1 - \frac{dist(\vec{o}, \vec{o}_{opponent})}{MaxDist(|A|)}\right)    (5)

The eagerness factor e_p can be chosen by the user, or else a system default is assumed before a negotiation process begins. The time pressure function is applicable for α > 0. In the extreme case (α = 0), an agent is totally benevolent. Under such circumstances, there is no need to consider time pressure because the agent will give ground completely at the beginning of a negotiation session.
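Continuing the sketch above, the time pressure of Eq. (4) and the enhanced fitness of Eq. (5) compose as follows (again with our own illustrative names):

def time_pressure(t, deadline, eagerness):
    # Eq. (4): returns 1 at t = 0 and 0 at the deadline; 0 < e_p < 1
    # models a Boulware agent, e_p > 1 a Conceder, e_p = 1 a Linear agent.
    return 1 - (min(t, deadline) / deadline) ** (1 / eagerness)

def fitness_with_time(o, o_max, o_opponent, alpha, weights, utility_of,
                      t, deadline, eagerness):
    # Eq. (5): as the deadline approaches, TP(t) shrinks the weight on the
    # agent's own payoff and shifts it towards matching the opponent.
    tp = time_pressure(t, deadline, eagerness)
    max_dist = math.sqrt(len(weights))
    dist = (max_dist if o_opponent is None
            else weighted_distance(o, o_opponent, weights))
    own = utility_of(o) / utility_of(o_max)
    return alpha * tp * own + (1 - alpha * tp) * (1 - dist / max_dist)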

To consider the opponent’s concession matching behaviour [12, 27] and to make ournegotiation system more robust, the eagerness factor [41, 42, 43, 45] can be adjustedby our system automatically. The adjustment of the eagerness factor depends on theopponent’s concession matching behaviour. To support this dynamic concession matchingbehaviour, an agent needs to maintain a separate population of chromosomes to adaptto the concession behaviour of each of its opponent. In general, our GA-based adaptivenegotiation agent tries to mimic its opponent’s concession matching behaviour. If theopponent did not concede in the past few rounds defined by a time window λ, an agentwould become Boulware too. On the other hand, if the opponent conceded, the agentwould become cooperative and start to concede too. The opponent’s concession rateCRopp(λ) is measured by:


CR_{opp}(\lambda) = \frac{U^o_p(o^{t-1}_{opponent}) - \mathrm{Avg}^{\lambda+1}_{i=2}\, U^o_p(o^{t-i}_{opponent})}{U^o_p(o^{t-1}_{opponent})}    (6)

where U^o_p(o^{t-1}_{opponent}) denotes the payoff of the previous offer (received one step ago) from the opponent, evaluated using agent p's utility function, and Avg^{λ+1}_{i=2} U^o_p(o^{t-i}_{opponent}) represents the average payoff of the past λ offers received from the opponent. If the opponent did not concede at all, CR_opp(λ) is less than or equal to zero; otherwise it is greater than zero. If CR_opp(λ) = 0, e_p = e^{MinBoulware}_p is set. For our current implementation, e^{MinBoulware}_p = 0.9. The maximum e^{max}_p and minimum e^{min}_p eagerness factors are defined to be 50 and 0.02 respectively. If CR_opp(λ) is less than zero (i.e., a very tough opponent), e_p = e^{min}_p is set. If CR_opp(λ) is greater than zero (i.e., the opponent has conceded), our system adjusts the agent p's eagerness factor [41, 42, 43, 45] e_p by:

e_p = \begin{cases} (CR_{opp}(\lambda) + 1) \times Adj_{conceder} & \text{if } e_p \le e^{max}_p \\ e^{max}_p & \text{otherwise} \end{cases}    (7)

Adj_{conceder} is the adjustment factor that converts the positive CR_opp(λ) values to eagerness values.
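The concession rate of Eq. (6) and the eagerness adjustment of Eq. (7) can be sketched as below, where history holds the payoffs U^o_p of the opponent's offers in arrival order and the bound constants are those quoted above (the function names and the capping via min() are our reading of Eq. (7)):

E_MIN, E_MAX = 0.02, 50      # minimum and maximum eagerness factors
E_MIN_BOULWARE = 0.9         # eagerness when the opponent stood still

def concession_rate(history, lam):
    # Eq. (6): relative gain of the opponent's latest offer (to agent p)
    # over the average of the lam offers received before it.
    latest = history[-1]
    window = history[-1 - lam:-1]
    return (latest - sum(window) / len(window)) / latest

def adjust_eagerness(history, lam, adj_conceder):
    cr = concession_rate(history, lam)
    if cr < 0:
        return E_MIN            # a very tough opponent
    if cr == 0:
        return E_MIN_BOULWARE   # the opponent did not concede
    return min((cr + 1) * adj_conceder, E_MAX)  # Eq. (7), capped at e_max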

4.3. The Genetic Algorithm

An agent’s adaptive negotiation strategy is developed based on the following geneticalgorithm:

LET i = 0;
CREATE the first population P^i, which consists of o_max and N-1 individuals randomly selected from the set O_p = D_{a_1} × D_{a_2} × ... × D_{a_n};
WHILE (i ≤ MaxE)
    P^{i+1} = Best(P^i);
    MP = Selection(P^i);
    DO UNTIL Size(P^{i+1}) = N
        I_1 = Crossover(MP);
        I_2 = Mutation(MP);
        I_3 = Cloning(MP);
        P^{i+1} = P^{i+1} ∪ I_1 ∪ I_2 ∪ I_3;
    END UNTIL
    i = i + 1;
END WHILE

The initial population P^0 is created by incorporating the member o_max that maximises an agent p's payoff in the first round, and by randomly selecting the N-1 remaining members from the candidate set O_p, where N is the pre-defined population size. At the beginning of each evolution cycle, the fitness value of each chromosome is computed based on the most current negotiation parameters (e.g., the agent's utility function and the opponent's counter-offer). Elitism is incorporated by executing the Best function to copy the e% fittest chromosomes from the current generation P^i to the new generation P^{i+1}. By executing the Selection function, either tournament selection [32] or roulette-wheel selection can be used to create a mating pool MP.
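A compact Python rendering of the above evolution cycle; select, crossover, mutate and clone stand in for the operators discussed next, and the operator rates shown are merely placeholders (a sketch, not the server's implementation):

import random

def evolve(population, fitness_of, n, elite_frac, max_evolutions,
           select, crossover, mutate, clone):
    # One evolution cycle: elitism (Best), a mating pool (Selection),
    # then crossover/mutation/cloning until the new generation is full.
    for _ in range(max_evolutions):
        ranked = sorted(population, key=fitness_of, reverse=True)
        next_gen = ranked[:max(1, int(elite_frac * n))]  # e% elitism
        pool = select(ranked)                            # mating pool MP
        while len(next_gen) < n:
            r = random.random()
            if r < 0.6:                       # crossover rate (placeholder)
                next_gen.extend(crossover(pool))
            elif r < 0.65:                    # mutation rate (placeholder)
                next_gen.append(mutate(pool))
            else:                             # cloning
                next_gen.append(clone(pool))
        population = next_gen[:n]
    return max(population, key=fitness_of)    # the round's tentative offer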

For tournament selection, a group of k members is selected from a population to form a tournament. The member with the highest fitness among the selected k members is placed in the mating pool. This procedure is repeated until the mating pool is full. Roulette-wheel selection is analogous to a roulette wheel where the probability that a member is chosen is proportional to its fitness. Our roulette-wheel selection function works on the ranks of the chromosomes rather than evaluating their fitness values directly. The reason is that a ranking reflects the relative performance of the chromosomes in a population and hence minimises the effect of large disparities in the fitness values. A ranking over a population of chromosomes is generated in descending order of fitness values. For each chromosome o, the probability of selection is computed according to:

Pr(o_{selected}) = 1 - \frac{1}{o_{rank}} \times (o_{rank} - 1)    (8)

where o_rank is the rank of a particular chromosome (i.e., a potential offer). Then, a random number in the unit interval is generated for each chromosome. If Pr(o_selected) of a chromosome is greater than or equal to the random number, the corresponding chromosome is selected and put into the mating pool. This process continues until the mating pool is full.
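Since Eq. (8) simplifies to Pr(o_selected) = 1/o_rank, the rank-based roulette wheel can be sketched as follows (our own rendering of the procedure just described):

def roulette_pool(ranked, pool_size):
    # ranked: chromosomes sorted in descending order of fitness, so the
    # chromosome of rank r is accepted with probability 1/r (Eq. (8)).
    pool = []
    while len(pool) < pool_size:
        for rank, member in enumerate(ranked, start=1):
            if random.random() <= 1 / rank:
                pool.append(member)
                if len(pool) == pool_size:
                    break
    return pool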

The standard genetic operators cloning, crossover, and mutation are applied to the mating pool to create new members according to pre-defined probabilities. These operations continue until the new generation of size N is created. Two-point crossover is used to exchange the fields (offer values) between two parents to create two new members; an example of a crossover operation is depicted in Figure 3. Mutation involves randomly replacing some attribute values encoded on a chromosome by other attribute values from the corresponding attribute domains (e.g., quantity). Figure 4 shows an example of an attribute domain defined for a buyer agent via the client interface. Therefore, the proposed mutation operation will not generate offer values which are not acceptable to an agent. An evolution process is invoked after every x negotiation rounds, where x is the evolution frequency defined by the user. The lower the value of x, the more GA-based learning takes place in a negotiation agent. For example, if x = 1, an agent applies the GA to find a tentative offer in each negotiation round. In general, a low value of x reduces the number of negotiation rounds needed to reach an agreement (if a solution exists) because an agent learns the opponent's preferences more frequently. Another parameter (MaxE) defines the maximum number of evolutions to be performed in an evolution cycle. An example of genetic parameters is depicted in Figure 5.
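The two operators can be sketched as follows, with offers held as tuples of attribute values so that the ID and fitness fields of Figure 3 stay untouched (illustrative code, not the production operators):

def two_point_crossover(p1, p2):
    # Exchange the genes between two randomly chosen crossover points.
    a, b = sorted(random.sample(range(len(p1) + 1), 2))
    return p1[:a] + p2[a:b] + p1[b:]

def mutate_offer(offer, domains):
    # Replace one randomly chosen gene by another value drawn from the
    # same attribute domain, so the result stays acceptable to the agent.
    i = random.randrange(len(offer))
    return offer[:i] + (random.choice(domains[i]),) + offer[i + 1:]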

As the proposed adaptive negotiation mechanism allows an agent to change its valuation functions (i.e., preferences) during a negotiation process, it is possible for an agent to propose the same offer twice in different negotiation rounds. To avoid an endless loop, a maximum number of negotiation rounds (i.e., a deadline) is specified by the user. If the deadline is due or an agreement is reached, the negotiation process is terminated. Basically, an agent's concession generation mechanism is underpinned by the above genetic algorithm because the GA determines which offer is to be proposed in the current round. Once an offer is determined by the GA, the agent's decision on an incoming counter-offer follows easily: if the incoming counter-offer produces a payoff greater than or equal to that of the current proposal, a rational agent should accept the incoming offer; otherwise the incoming counter-offer should be rejected. Our negotiation agents do not explicitly adjust the aspiration level as proposed by [44]. In this context, the aspiration level means the gap (i.e., the difference of two utilities) between the current proposal and the incoming offer [44]. Under tight constraints such as a short deadline and a tough opponent, our proposed concession mechanism (e.g., the fitness function) automatically lowers the agent's expectation and considers a moderate offer (in terms of perceived payoff) as the current solution. In other words, the aspiration level is lowered.

Figure 4. A Buyer Client Interface

Figure 5. An Example of Genetic Parameters


5. The Experiments

The negotiation spaces Neg of our experiments were characterized by bilateral negotiations between a buyer agent p_B and a seller agent p_S. Each negotiation profile consisted of 5 attributes, with each attribute domain containing 5 discrete values represented by the natural numbers D_a = {1, 2, ..., 5}. The valuation of an attribute or of a discrete attribute value was in the interval (0, 1]. For each negotiation case, an agreement zone always exists since the difference between a buyer and a seller lies only in their valuations of the same set of negotiation issues (i.e., attributes and attribute values). For each agent, the size of the candidate offer set O_p is 3,125. Five negotiation groups, each containing 10 cases, were constructed. For each negotiation case in a group, the negotiation profile (e.g., the preferences) of a buyer was randomly created according to the configuration mentioned before (i.e., 5 attributes with each attribute containing a value d_{a_i} ∈ D_{a_i}). The buyer's profile was then copied to create a seller's profile, and the seller's profile was modified with various levels of preferential difference introduced (e.g., modifying the valuations of some attribute values).

For the first simulation group, each negotiation case contained identical buyer/seller preferences (i.e., the same weights for the attributes and the same valuations for the same set of attribute values). This group was used as a control group and the other groups were the experimental groups. Each negotiation case in the second group contained a 20% preferential difference between the buyer and the seller (e.g., one valuation of an attribute value is different). Each case in each succeeding group was injected with a further 20% increment of preferential difference. The genetic parameters were: population size = 200, mating pool size = 150, size of solution set = 1, elitism factor = 10%, tournament size = 3, cloning rate = 0.2, crossover rate = 0.6, mutation rate = 0.05, number of evolutions per cycle = 3, and evolution frequency = 1 (i.e., one evolution cycle per negotiation round). Tournament selection was used for the experiments reported in this paper. The negotiation trade-off factor α = 0.5 and the negotiation deadline = 100 (rounds) were set for both the buyer and the seller. One hundred rounds were seen as a moderate deadline given the large negotiation space. For each simulation run, an agent (either the buyer or the seller) was randomly chosen to initiate the negotiation process. The basic performance measures were joint payoff (JP), success rate (SR), fairness ratio (FR), and negotiation rounds (RD). Joint payoff is the sum of the utilities generated from the agreement as computed according to the buyer's and the seller's utility functions respectively. If an agreement was not reached after the deadline was due, zero utility was assumed for each agent. Success rate is the percentage of cases with agreements made over the number of negotiation cases. The fairness ratio represents the degree of equity in the negotiation and is computed by taking the ratio of the seller's individual payoff to the buyer's individual payoff [27].
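For concreteness, the construction of the negotiation cases can be sketched as follows; this is our reconstruction from the description above (the names, the perturbation routine, and the uniform sampling are assumptions), not the original experiment harness:

import random

def random_profile(n_attrs=5, n_values=5):
    # A profile: normalised attribute weights plus a valuation in (0, 1]
    # for each of the discrete attribute values 1..5.
    raw = [random.random() + 1e-9 for _ in range(n_attrs)]
    weights = [w / sum(raw) for w in raw]
    valuations = [[random.uniform(1e-6, 1.0) for _ in range(n_values)]
                  for _ in range(n_attrs)]
    return weights, valuations

def seller_from_buyer(buyer, difference):
    # Copy the buyer's profile, then re-draw a fraction `difference` of
    # the attribute-value valuations to inject preferential difference.
    weights, valuations = buyer
    valuations = [row[:] for row in valuations]
    for i in random.sample(range(len(valuations)),
                           round(difference * len(valuations))):
        j = random.randrange(len(valuations[i]))
        valuations[i][j] = random.uniform(1e-6, 1.0)
    return weights[:], valuations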

Experiment 1

The purpose of the first experiment was to study the general performance of the GA-based adaptive negotiation agents compared with the baseline negotiation model (Section 3), which guarantees Pareto optimality. The main hypothesis was that the performance of the GA-based negotiation agents should be better than that of the baseline negotiation system under realistic negotiation conditions (e.g., limited negotiation time). The first simulation run involved negotiation agents developed according to the baseline negotiation model. The second simulation run involved GA-based adaptive negotiation agents. For the GA-based negotiation agents, both the buyer agent and the seller agent were Boulware agents with eagerness factor e_p = 0.5. In other words, the interactions of these agents were symmetric. In this experiment, the fitness function Eq. (5) was used. However, the eagerness factor e_p = 0.5 remained the same throughout the experiment. The concession matching mechanism Eq. (6) was not activated in the GA-based negotiation agents. In other words, the agents were adaptive to each other's moves and the negotiation deadline, but they were not responsive to the opponent's concession matching behaviour. All the simulation runs (each case in the 5 negotiation groups) were performed on our negotiation server with the configuration of a single Pentium III 800 MHz CPU and 256 MB main memory. Both the GA-based negotiation agents and the baseline negotiation system dealt with exactly the same set of negotiation cases. The following measures were used to evaluate the relative performance of the GA-based adaptive negotiation agents against the baseline negotiation system:

\Delta_{utility} = \frac{JP_{GA} - JP_{baseline}}{JP_{baseline}} \times 100\%

\Delta_{rate} = \frac{SR_{GA} - SR_{baseline}}{SR_{baseline}} \times 100\%

\Delta_{round} = \frac{RD_{GA} - RD_{baseline}}{RD_{baseline}} \times 100\%

For the baseline negotiation system, the joint utility (JP_baseline) represents the sum of the buyer's payoff and the seller's payoff obtained at the Pareto optimal point. However, it should be noted that the maximal joint utility of a negotiation session is not necessarily obtained at the Pareto optimal point. SR_baseline is the average success rate achieved by the baseline negotiation mechanism described in Section 3. If the agents can reach an agreement before the deadline (i.e., 100 negotiation rounds), the negotiation session is considered successful. RD_baseline refers to the average number of negotiation rounds consumed by the baseline negotiation mechanism. Table I summarizes the average ∆utility, ∆rate, and ∆round for each negotiation group. A positive ∆utility indicates that the GA-based negotiation agents are more effective than the baseline negotiation system, and a positive ∆rate shows that the GA-based negotiation agents are more efficient because they can finish negotiations on or before the deadline. However, it is a negative ∆round that indicates the GA-based adaptive negotiation agents can finish the negotiation processes in fewer negotiation rounds. If an agreement can be reached, the actual number of negotiation rounds consumed by the agents is recorded. If an agreement cannot be reached before the deadline, 100 rounds are counted for that particular negotiation session.

Overall results of ∆utility = 10.3%, ∆rate = 15.8% and ∆round = −18.2% were obtained, and the original hypothesis was confirmed. On average, the baseline model consumed 96.5 negotiation rounds (RD_baseline = 96.5) to deal with the various negotiation sessions, while the GA-based negotiation agents took 78.9 negotiation rounds (RD_GA = 78.9) to complete the negotiation sessions. The proposed negotiation method ensures that "as little money is left on the table as possible". Since the proposed GA-based negotiation agents can observe their opponents' preferences and continuously learn this information via the opponents' counter-offers, the search for a mutually acceptable offer becomes faster. Moreover, as the GA-based negotiation agents were responsive to time pressure, most of the agreements could be reached on or before the deadline. On the other hand, the basic negotiation system failed to develop some negotiation solutions given limited negotiation time. Although the basic negotiation system can guarantee Pareto optimal solutions, it is less useful for practical negotiations because it is not responsive to the changing negotiation contexts. For the control negotiation group (group 1), both systems could successfully identify all the negotiation solutions given exactly the same preferences of the negotiators. In fact, both systems could identify the solution in the first round for each case in the first negotiation group. In general, the performance gap between the two negotiation systems becomes larger as the preferential differences between the buyer and the seller grow. The reason is that the GA-based negotiation agents are equipped with effective learning mechanisms to learn and adapt to the opponents' preferences even when the preferential differences between the parties are big. Therefore, the GA-based negotiation agents perform better in these more realistic and more challenging negotiation sessions. The improvement in the average number of negotiation rounds needed to complete a negotiation session achieved by the GA-based adaptive negotiation agents indicates that these agents are more efficient than the agents developed based on the basic negotiation model. Except for the first negotiation group, where both kinds of agents complete every negotiation session after the first round, the GA-based agents generally complete a negotiation session faster than their non-adaptive counterparts.

Table I. Comparative negotiation performance: GA vs. baseline

Group     Preferential Difference     ∆utility     ∆rate     ∆round
1         0%                          0.0%         0.0%      0.0%
2         20%                         −1.6%        0.0%      −36.5%
3         40%                         9.8%         11.1%     −31.8%
4         60%                         17.2%        25.0%     −15.5%
5         80%                         26.1%        42.8%     −7.1%
Average                               10.3%        15.8%     −18.2%

Experiment 2

The purpose of the second experiment was to examine the capabilities of the GA-based adaptive negotiation agents in dealing with dynamic negotiation environments (e.g., where the preferences of an agent and its opponent change during a negotiation process). The main hypothesis was that the GA-based negotiation agents would be responsive to the changing negotiation preferences, and so should demonstrate a bigger performance boost when compared with the baseline negotiation system, which cannot take into account any changing negotiation preferences. The same set of negotiation cases and the same agent parameters used in experiment one were adopted in this experiment. The fitness function Eq. (5) was still used in the GA-based negotiation agents. The only difference between experiment 1 and experiment 2 was that a random preferential change (e.g., to the valuation of an attribute or attribute value) was injected into both the buyer and the seller at the end of every five negotiation rounds if an agreement had not been established by that round. Similar to experiment 1, the interactions between the buyer and the seller were symmetric since both of them demonstrated changing negotiation preferences with the same frequency. With reference to our client interface (Figure 4), a breakpoint was set to 5 (i.e., a negotiation process continues for 5 rounds and then stops) and then the valuations were modified. Both the GA-based negotiation agents and the baseline system were exposed to the same set of negotiation cases with preferential changes. The baseline model cannot take into account any changing preferences, and so its negotiation processes were not affected. However, when the payoff of a buyer or a seller was computed at the end of a negotiation process, the most up-to-date utility function (due to the changing preferences of a negotiator) was used to compute the agent's payoff.

Table II summarizes the average ∆utility and ∆rate for each negotiation group. Overall results of ∆utility = 31.6% and ∆rate = 15.8% were obtained, and our second hypothesis was confirmed. Apart from the control group (group 1), the performance of the GA-based adaptive negotiation agents was much better than that of the baseline system. For the control group, both systems could successfully finish all the negotiations in round one before any preferential changes were introduced. Therefore, the results of group one were the same as those obtained in experiment one. It was obvious that the baseline system could not find the right solutions for most of the cases in the remaining groups. On the other hand, the GA-based negotiation agents still performed well because they could adapt to the preferential changes and were able to reach agreements before the deadline for all the cases. Overall, this experiment shows that the GA-based adaptive negotiation mechanism is quite effective in dynamic negotiation environments.

Table II. Comparative Negotiation Performance in a Dynamic Environment

Group     Preferential Difference     ∆utility     ∆rate
1         0%                          0.0%         0.0%
2         20%                         21.6%        0.0%
3         40%                         32.8%        11.1%
4         60%                         42.3%        25.0%
5         80%                         61.5%        42.8%
Average                               31.6%        15.8%

Experiment 3The purpose of the third experiment was to examine the concession matching mecha-

nisms of the GA-based negotiation agents. This experiment only involved the GA-basedadaptive negotiation agents. The control group was the adaptive agents without the con-cession matching mechanisms Eq.(6) activated, while the experimental group consisted ofagents with the concession matching mechanisms activated. The parameter λ = 3 was setwhen the agents used the concession matching tactics. Ten negotiation cases from group 4which were used in experiment one and two were adopted in this experiment. To simulatedifferent encounters such as Boulware-to-Boulware, Boulware-to-Conceder, Conceder-to-


Boulware, and Conceder-to-Conceder, the buyer agent and the seller agent took the respective roles in turn. As a whole, it was a 9 × 2 × 10 factorial design involving 180 simulation runs, since there were 9 types of encounters, 2 kinds of agents (concession matching mechanisms activated or not), and 10 negotiation cases. For a Boulware agent, the eagerness factor [41, 42, 43, 45] was set to ep = 0.5; for a Conceder agent, the eagerness factor was ep = 5; and a Linear agent had ep = 1 (a reference sketch of the concession curves these eagerness values simulate appears after Table III). In this experiment, the average fairness ratio was used to evaluate the performance of our concession matching mechanism. The fairness ratio represents the degree of equity in a negotiation and is computed as the ratio of the seller's individual payoff to the buyer's individual payoff [27]. All the genetic parameters and agent parameters were the same as in the previous experiments except for those mentioned above. Our hypothesis in this experiment was that the negotiation agents with the concession matching mechanisms activated should produce more robust results, that is, the fairness ratios should be higher in all kinds of encounters. The relative performance of the agents with concession matching tactics was measured using the following metric:

$$\Delta_{Fair} = \frac{FR_{CMT} - FR_{NCMT}}{FR_{NCMT}} \times 100\%$$

where $FR_{CMT}$ denotes the average fairness ratio for agents with the concession matching tactics employed, and $FR_{NCMT}$ represents the average fairness ratio for agents without the concession matching tactics employed.
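For concreteness, a minimal sketch of these two measures (the sample ratios are illustrative, not measured values):

```python
def fairness_ratio(seller_payoff, buyer_payoff):
    """Degree of equity: ratio of the seller's payoff to the buyer's [27]."""
    return seller_payoff / buyer_payoff

def delta_fair(fr_cmt, fr_ncmt):
    """Relative fairness gain of the concession matching tactics, in percent."""
    return (fr_cmt - fr_ncmt) / fr_ncmt * 100.0

# Illustrative averages: 0.95 with tactics vs. 0.87 without gives ~9.2%.
print(delta_fair(0.95, 0.87))
```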

Table III depicts the average ∆Fair for the 9 encounters. A positive ∆Fair indicates that a fairer result was achieved. As can be seen, a better result (in terms of a higher fairness ratio) was produced when our concession matching mechanisms were applied to negotiations, and so our third hypothesis was confirmed. The reason was that the agents tried to mimic and adapt to each other during negotiation; consequently, a fairer outcome was produced. For example, when a Boulware agent negotiated with a Conceder agent, the Boulware agent learned to be less Boulware, while the Conceder agent learned to be tougher (i.e., less conceding). Therefore, their concession making behaviour converged gradually and the resources were more evenly distributed. If both agents were of the same type, their relative positions remained the same, but the tougher Boulware became less Boulware while the tender Boulware became tougher. As a result, their concession behaviour also converged.

Table III. The Effect of Learning Concession Matching Behaviour

Agent Trait   Boulware   Conceder   Linear
Boulware      1.3%       9.5%       3.3%
Conceder      10.4%      0.9%       1.5%
Linear        3.1%       1.8%       0.6%
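The eagerness values used above can be read against the polynomial time-dependent decision functions commonly used in this literature [10]; the sketch below is that reference model, an assumption for illustration, since the agents' actual concession behaviour emerges from the GA.

```python
def concession(t, deadline, eagerness, initial, reservation):
    """Value offered at round t under a polynomial time-dependent tactic.

    eagerness < 1 gives Boulware behaviour (little concession until near
    the deadline), eagerness > 1 gives Conceder behaviour (early, large
    concessions), and eagerness = 1 concedes linearly.
    """
    alpha = (t / deadline) ** (1.0 / eagerness)
    return initial + alpha * (reservation - initial)

# Boulware (0.5), Linear (1), Conceder (5) after 20 of 100 rounds:
for e in (0.5, 1.0, 5.0):
    print(e, round(concession(20, 100, e, 1.0, 0.5), 3))  # 0.98, 0.9, 0.638
```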

Experiment 4

The purpose of the fourth experiment was to examine the interactions among different kinds of agent attitudes (e.g., from self-interested to benevolent). In addition, we would like to evaluate whether our adaptive negotiation mechanism Eq.(5) could improve the negotiation


outcome under extreme agent attitudes. Our hypothesis was that the adaptive negotiation mechanism, which can take the approaching deadline into account, would lead to a higher success rate in extreme encounters such as fully self-interested attitude against fully self-interested attitude. The literature indicates that fully self-interested agents do not necessarily achieve good negotiation outcomes (e.g., the Prisoner's Dilemma) [37, 47]. Through our simulated negotiations, it is interesting to see whether certain agent attitudes generally lead to better outcomes under realistic conditions (e.g., limited information, time pressure). Ten negotiation cases in group 4, which were used in experiments one and two, were employed in this experiment. For the first series of simulation runs, the fitness function Eq.(1) was employed so that the agents' attitudes were fixed in each negotiation session. For the second series of simulation runs, the fitness function Eq.(5) was employed so that the agents' attitudes were adjusted with respect to the time pressure. Basically, it was a 5 × 5 × 2 × 10 factorial design: there were 5 buyer attitudes (α = 0.2, . . . , α = 1.0), 5 seller attitudes, and for each interaction there were 10 negotiation cases. This experimental setting was repeated twice, once for non-adaptive agents and once for agents that were responsive to time pressure.

Joint-payoff (JP) and success rate (SR) were used to measure the agents' performance with respect to the various agent attitudes. For this experiment, if an agreement could not be reached before the deadline (100 rounds), the case was excluded when computing the average joint-payoff, since we wanted to clearly examine whether a particular interaction of agent attitudes promoted joint-payoff. The genetic parameters were the same as those used in the previous experiments, but the trade-off factor varied in the range [0.2, 1]. For each pair of agent attitudes, such as αbuyer = 0.2 and αseller = 0.2, the 10 negotiation cases were executed to obtain the average joint-payoff and the average success rate within 100 negotiation rounds. Figure 6.a and Figure 6.b show the average joint-payoff and the average success rate for various interactions of agent attitudes when the agents were not responsive to time pressure. In addition, Figure 6.c and Figure 6.d show the average joint-payoff and the average success rate for the same set of interactions when the agents were responsive to time pressure.
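A minimal sketch of how these two measures aggregate over the simulation runs (the sample data are illustrative):

```python
def aggregate(results):
    """Average joint-payoff (JP) and success rate (SR) over runs.

    `results` holds one (buyer_payoff, seller_payoff) tuple per case
    that reached agreement within the deadline, or None for a timeout;
    timeouts lower SR but are excluded from the JP average, as in this
    experiment.
    """
    agreed = [sum(r) for r in results if r is not None]
    sr = len(agreed) / len(results)
    jp = sum(agreed) / len(agreed) if agreed else 0.0
    return jp, sr

# Three agreements and two timeouts out of five cases:
print(aggregate([(0.6, 0.5), None, (0.7, 0.4), (0.5, 0.5), None]))  # (~1.07, 0.6)
```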

By comparing Figure 6.a and Figure 6.b, it is not difficult to see that the average joint-payoff was the lowest when both agents were benevolent (αbuyer = αseller = 0.2). However, the average success rate was the highest at this point. On the other hand, when both agents were fully self-interested (αbuyer = αseller = 1), the average joint-payoff was the highest although the average success rate was the lowest (average success rate = 0.1). So, our experiment confirmed the findings reported in the literature. Even though there is a temptation for agents to be fully self-interested (e.g., achieving a high payoff as shown in our simulation), they may in fact be worse off since most of these encounters cannot lead to agreements. Figure 6.c and Figure 6.d also confirm that our proposed GA-based adaptive negotiation mechanism, which takes the time pressure into account, leads to better outcomes in terms of success rate. The average success rates were substantially improved under various encounters while the joint-payoffs remained more or less the same when compared with the results produced by the agents which were not responsive to time pressure (Figure 6.a and Figure 6.b). Therefore, the fourth hypothesis is supported by this experiment. In fact, by employing the proposed adaptive negotiation mechanism that takes time pressure into account Eq.(5), not only the outcomes of the extreme encounters but also the outcomes of most agent encounters were improved.


[Figure 6 consists of four surface plots over the buyer's and the seller's trade-off factors (0.20–1.00): a. Average Joint-Payoff (Non-adaptive); b. Average Success Rate (Non-adaptive); c. Average Joint-Payoff (Adaptive); d. Average Success Rate (Adaptive).]

Figure 6. The Interactions of Agent Attitudes

Experiment 5

The purpose of the fifth experiment was to explore multi-lateral negotiation situations consisting of two buyer agents (B1, B2) and two seller agents (S1, S2). These agents negotiated over virtual services or products described by five attributes (i.e., |A| = 5), with each attribute domain containing 5 discrete values, as described before. Each negotiation case in this experiment was defined in terms of the valuation functions $U^A_p$ and $U^{D_a}_p$ for each agent p participating in the negotiation process. Each buyer (seller) participating in a negotiation process was assumed to have a product to buy (sell). Similar to the previous experiments, an agreement zone always exists in a negotiation case.

A synchronised alternate-offering protocol was used in this experiment. At the beginning of every negotiation round, each agent executed its genetic algorithm to generate an offer for that round. In the message exchange phase, each agent sent its offer messages to each of its opponents (e.g., S1 → B1, S1 → B2) according to a predefined order. After the message exchange phase in each negotiation round, the simulation controller randomly serialised the offer evaluation processes into a particular sequence


(e.g., < B2, B1, S1, S2 >). Then, each agent selected the best incoming offer (evaluated according to its private utility function) as the opponent offer $o_{opponent}$ for that negotiation round. If there was a tie, an opponent was selected randomly. Each agent could then determine whether $o_{opponent}$ should be accepted with reference to its own proposal in the current round. If an agreement was made between a pair, they were removed from the negotiation table, and the remaining two agents continued their negotiation until either an agreement was made or the deadline was due.
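A sketch of one round of this protocol is given below; the Agent class and its run_ga/utility/role members are hypothetical stand-ins, not our actual implementation.

```python
import random

class Agent:
    """Minimal stand-in for a GA-driven negotiation agent (hypothetical API)."""
    def __init__(self, name, role, utility_fn, ga_fn):
        self.name, self.role = name, role
        self.utility, self.run_ga = utility_fn, ga_fn

def negotiation_round(agents, matched):
    """One round of the synchronised alternate-offering protocol.

    Offers are generated first, then evaluated in a randomly serialised
    order; an agent accepts the best incoming offer if it is at least as
    good as the agent's own proposal, and matched pairs leave the table.
    """
    active = [a for a in agents if a not in matched]
    offers = {a: a.run_ga() for a in active}           # offer-generation phase
    for agent in random.sample(active, len(active)):   # random serialisation
        if agent in matched:
            continue
        rivals = [r for r in active if r.role != agent.role and r not in matched]
        if not rivals:
            continue
        # ties broken arbitrarily here (randomly in the experiment)
        best = max(rivals, key=lambda r: agent.utility(offers[r]))
        if agent.utility(offers[best]) >= agent.utility(offers[agent]):
            matched.update({agent, best})              # agreement: pair leaves the table
```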

Table IV. Comparative Performance of (GA) vs. (Baseline) Agents in Multilateral Negotiations

                     (GA)                                (Baseline)
Agent   Average Payoff   Average Time (rounds)   Average Payoff   Average Time (rounds)
B1      0.62             67.9                    0.18             96.2
B2      0.59             67.6                    0.19             95.8
S1      0.47             67.9                    0.13             95.9
S2      0.48             68.1                    0.11             96.1

Ten negotiation cases were developed according to the various valuation functions $U^A_p$ and $U^{D_a}_p$ assigned to the four agents. The payoff obtained by each agent from every negotiation process was recorded. If no agreement was made by the deadline, the payoff obtained by an agent was zero. The average payoff obtained by each agent and the average negotiation time (in rounds) consumed over the ten negotiation cases are depicted under the (GA) columns in Table IV. The negotiation trade-off factors α = 0.8 and α = 0.5 were applied to the buyer agents and the seller agents respectively. In addition, the eagerness factor [41, 42, 43, 45] ep = 1 was employed by each agent.

The same set of negotiation cases was assigned to the agents developed based on the basic negotiation mechanism defined in Section 3. The results of these agents are listed under the (Baseline) columns in Table IV. From Table IV, it is obvious that the GA-based adaptive negotiation agents again outperform the negotiation agents which guarantee Pareto optimality, in terms of both average payoff (effectiveness) and average negotiation time (efficiency). One of the main reasons is that the non-learning agents cannot find negotiation solutions (i.e., agreements) in many cases under a tough deadline of 100 negotiation rounds. On the contrary, even though the (GA) agents cannot guarantee Pareto optimality in general, they are sensitive to negotiation deadlines and adaptive with respect to their opponents' negotiation behaviour. Therefore, agreements can be made in most of the negotiation cases.

6. Related work

Many researchers in automated negotiation [14, 25, 49] have stressed the necessity and importance of developing adaptive negotiation mechanisms which allow negotiation agents


to learn their opponents' preferences and concession making behaviour in dynamic negotiation environments. However, existing learning approaches for automated negotiation, such as Bayesian learning [5, 50] and Case-Based Reasoning (CBR) [46, 52], are still primitive in terms of what can be learned (e.g., price only) and their sensitivity to changing negotiation contexts. Recently, there have been several proposals and implementations which employ genetic algorithms to empower negotiation agents with learning and adaptation capabilities [1, 6, 15, 27, 28, 39].

A Naive Bayesian classifier has been proposed to develop the learning mechanisms of negotiation agents in multiagent systems [5]. It is argued that the learning capabilities of autonomous agents are essential since agents may not be willing to share their preferences, and the communication costs of exchanging full preferences among the agents could be high. Negotiation is treated as a process in which the agents refine their joint agreement set from the initial set of all feasible agreements to a single final agreement. Basically, such a refinement process is modelled as a hill-climbing search in an agreement tree. The root node of the agreement tree contains all feasible agreements and all the leaf nodes are singleton sets. The set of all children of an intermediate node is a partition of that node. Negotiation is a coordinated search through the tree to find a leaf (agreement) acceptable to all the agents. The search involves two stages: asynchronous refinement and collective refinement. In the asynchronous refinement stage, an agent tries to maximise the social welfare (global utility) based on its own refinement bias function (i.e., a utility function). If all agents propose exactly the same refinement (i.e., agreement set), that particular refinement is adopted as the solution at the collective refinement stage; otherwise the agents exchange their preferences and attempt to find an alternative refinement to maximise the group's payoff at the collective refinement stage. This process may involve back-tracking to a node at a higher level if all the nodes below the current level do not lead to a joint agreement.

As can be seen, the main issue lies in the refinement bias function used by an agent at the asynchronous refinement stage. A refinement bias function (i.e., a utility function) can be expressed as $U^a_{group}(o) = U_a(o) + \sum_{b \in \{A - a\}} U_b(o)$, where $U^a_{group}(o)$ is agent a's perception of the joint payoff for an offer o, and $U_a(o)$ is agent a's own utility function. $\sum_{b \in \{A - a\}} U_b(o)$ represents agent a's view of the joint payoff perceived by the other agents, and A is the set of agents participating in negotiation. In the extreme case, an agent may adopt the ignorance approach such that $U^a_{group}(o) = |A| \times U_a(o)$ is computed. In other words, the agent simply uses its own utility function $U_a(o)$ to stand in for the other agents' utility functions. However, this approach may lead to much conflict at the collective refinement stage.

An agent can learn the other agents' preferences from their previous negotiation history. In particular, the Naive Bayesian classifier $\Pr(c_i|x) = \frac{\prod_j \Pr(e_j|c_i) \Pr(c_i)}{\Pr(x)}$ is used, where $c_i$ represents the discretised utility value (i.e., a class) and $e_j$ is one of the features of the object (i.e., offer) x. In other words, given an offer x, the probability of it generating a payoff $c_i$ for an agent is approximated by the prior probabilities (e.g., $\Pr(e_j|c_i)$ and $\Pr(c_i)$) which can be obtained from the past negotiation history. The expected payoff of an offer o as perceived by an agent b is $U^{exp}_b(o) = \sum_i \Pr(c_i|o) \times \bar{c}_i$, where $\bar{c}_i$ is the mean of the discretised utility $c_i$. Through simulated experiments, it was found that agents with learning mechanisms underpinned by the Naive Bayesian classifier performed better than agents without the capability of learning others' preferences, in terms of reduced solution cost (i.e., higher joint payoff) and a reduced number of messages exchanged during the negotiation process. Nevertheless, one important issue which was not explained clearly


in the paper [5] is how the prior probability $\Pr(e_j|c_i)$ is obtained. The prior probability $\Pr(e_j|c_i)$ implies that the opponents need to disclose their preference (utility) values for some negotiation issues during the previous negotiation process. Such a requirement may contradict the original assumption of the negotiation process, namely that agents tend to keep their preference information private.
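The expected-payoff estimate described above can be sketched as follows; the data layout, the Laplace smoothing, and the use of class labels as their own means are simplifying assumptions for illustration.

```python
from collections import defaultdict

def expected_payoff(offer, history):
    """Naive Bayes estimate of an offer's payoff, in the spirit of [5].

    `history` is a list of (offer_features, utility_class) pairs from
    earlier negotiations; utility classes are discretised levels whose
    numeric labels double as their means here, a simplification.
    """
    class_counts = defaultdict(int)
    feature_counts = defaultdict(int)
    for feats, c in history:
        class_counts[c] += 1
        for f in feats:
            feature_counts[(f, c)] += 1
    total = len(history)
    scores = {}
    for c, n in class_counts.items():
        p = n / total                                    # Pr(c_i)
        for f in offer:
            p *= (feature_counts[(f, c)] + 1) / (n + 2)  # smoothed Pr(e_j|c_i)
        scores[c] = p
    z = sum(scores.values()) or 1.0                      # normalising term Pr(x)
    return sum(c * p / z for c, p in scores.items())
```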

It has been argued that the challenge of research in negotiation is to develop models that can track the shifting tactics of negotiators [27]. Accordingly, a genetic algorithm based negotiation mechanism was developed to model the dynamic concession matching behaviour arising in bi-lateral negotiation situations [27]. The set of feasible offers of a negotiation agent is represented by a population of chromosomes. The fitness of each chromosome (i.e., a feasible offer) is measured by the fitness function fitness(o) = Pself − λ × Popponent, which is derived from Social Judgement Theory [4]. Pself and Popponent represent the payoffs of an agent and its opponent respectively. The parameter λ ∈ {1, −1} indicates whether a negotiation agent is situated in a competitive or a cooperative situation. The parameter λ is established according to the opponent's concession matching behaviour in previous negotiation rounds. For example, if the opponent does not concede at all in previous rounds, this indicates that the agent is faced with a highly competitive situation, and so λ should be set to 1 to penalise the opponent. Based on the standard genetic operators such as selection, crossover, and mutation, a population of chromosomes evolves over time. After a pre-defined number of evolutions, the fittest chromosome from the current generation is chosen as a tentative solution (i.e., a counter-offer). If an agent receives an incoming offer that is perceived to yield a higher payoff than its own tentative counter-offer, it will accept the offer. The main drawback of this negotiation model is that it requires knowledge about the opponent's utility function to compute Popponent and to evaluate an incoming (counter-)offer. The system is not equipped with a learning mechanism to learn the opponent's preferences; it is assumed that an agent and its opponent have the same utility function. The GA-based multi-agent multi-issue negotiation mechanism proposed in this paper does not rely on such an assumption about the opponents' negotiation preferences. Moreover, the proposed negotiation mechanism is adaptive in the sense that negotiators' changing preferences and negotiation behaviour are taken into account by the proposed GA.
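The fitness function of [27] is simple enough to state directly; the concession test below is a simplified placeholder for the behaviour-tracking rule in the original model.

```python
def krovi_fitness(p_self, p_opponent, opponent_conceded):
    """fitness(o) = P_self - lambda * P_opponent, after [27].

    lambda = 1 penalises the opponent in a competitive situation (no
    concession observed so far); lambda = -1 rewards joint gains in a
    cooperative one.
    """
    lam = -1 if opponent_conceded else 1
    return p_self - lam * p_opponent
```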

An agent-based multi-issue negotiation model has also been developed based on a genetic algorithm [6]. Essentially, this negotiation model is mainly based on Krovi's work [27] and focuses on learning an opponent's concession matching behaviour. The main difference between the two systems lies in the computation of the λ parameter. Choi et al.'s negotiation mechanism [6] seems to demonstrate a large departure from Social Judgment Theory. Though a learning mechanism based on stochastic approximation is proposed to learn the opponent's negotiation preferences, the given formulas do not show clearly how the weight of a negotiation issue is derived. To strike a better balance between exploitation-oriented and exploration-oriented genetic search, an adaptive mutation operator is proposed.

Rubenstein-Montano and Malaga have also reported a GA-based negotiation mechanism for searching for optimal solutions in multiparty multi-objective negotiations [39]. Basically, a negotiation problem is treated as a multi-objective optimisation problem. Apart from the standard genetic operators such as selection, crossover, and mutation, the GA is enhanced with a new operator called trade. The trade operator simulates a concession making mechanism which is often used in negotiation systems. A chromosome is used to represent a payoff matrix (e.g., each column represents a negotiation party


and each row represents a negotiation issue in the matrix). The negotiation performance of the GA enhanced with the trade operator was compared to other approaches: a similar GA without the trade operator, a nonlinear programming method, a hill-climber, and a random searcher. Their experimental results show that the GA with the trade operator performs better than the other approaches in terms of maximising the joint payoffs in distributive negotiation scenarios involving four issues. However, the main problem of this particular GA-based negotiation mechanism is that the preferences (i.e., the utility functions) of all the negotiation parties are assumed to be available to a central negotiation mechanism. Moreover, the preferences of the negotiation parties are assumed to be unchanged during a negotiation process. Because of these assumptions, such a negotiation mechanism seems to have very limited use in real-world negotiation environments such as e-Business, where a negotiator's preferences are often kept private. Our proposed negotiation mechanism does not assume complete knowledge about a negotiation space. Instead, the GA-based adaptive negotiation mechanism can observe and gradually learn the opponents' negotiation preferences.

A genetic algorithm has also been applied to learn effective rules to support negotiation processes [33]. A chromosome represents a negotiation (classification) rule rather than an offer. Each classification rule has the pattern $\langle p, s \rangle : \langle p', s' \rangle$, where p and s represent the purchaser's current offer and the seller's current offer respectively, and p' and s' are the subsequent offer of the purchaser and the subsequent counter-offer of the seller. The fitness of a chromosome (a rule) is measured in terms of how many times the rule has contributed to reaching an agreement. If a rule is fired and subsequently leads to an agreement, one unit of fitness is added to the rule. Standard genetic operators such as roulette wheel selection, crossover, and mutation are used to evolve populations. In order for the system to determine whether an agreement is possible, each negotiator's preferences, including the reservation values of the negotiation attributes, are assumed known or hypothesised. Therefore, this approach also suffers from the same problem as the other methods which assume complete information about negotiation spaces.

Instead of using the evolutionary approach to develop an agent's concession generation mechanism, a GA has been used to learn optimal negotiation tactics given a particular negotiation situation (e.g., a predefined amount of negotiation time) [32]. Negotiation tactics such as time-dependent tactics, resource-dependent tactics, and behaviour-dependent tactics are examined. A negotiation agent and its tactics, in particular the various parameters associated with these tactics, are represented by a chromosome. To compute the fitness of a chromosome, an agent needs to negotiate with each partner in a market-place to obtain the average joint-payoff. Then, this joint-payoff is compared to that obtained at the theoretical Nash equilibrium point. It is found that using a weighted combination of tactics to determine the concession value of each negotiation issue leads to more robust results (e.g., in terms of stable joint-payoffs) under varying negotiation situations. A similar approach has also been applied to find the optimal negotiation parameters for a set of pre-defined negotiation tactics under a case-based negotiation architecture and a fuzzy rule-based negotiation architecture respectively [31]. Nevertheless, both of these approaches [32, 31] suffer from the serious limitation that they only work under a centralised decision making mechanism where complete information about each negotiator's preferences is available. On the contrary, the approach discussed in this paper is based on the assumption of a decentralised decision making model where each agent makes its own negotiation decisions according to its respective GA. Such an assumption better reflects real-world negotiation situations such as negotiations for e-Business.


Similarly, a GA has been used to study the bargaining behaviour of boundedly rational agents in single-issue (e.g., price) bi-lateral negotiation situations, and the empirical results produced by the GA are compared to the equilibrium outcomes generated by a game-theoretic analysis [13]. Each chromosome encodes an agent's strategy, which consists of four elements: the agent's initial price, the final price, the eagerness factor simulating a wide spectrum of concession behaviour (e.g., extreme Boulware to extreme Conceder), and the agent's negotiation deadline. The opponent's reservation price is assumed known and is used to establish an agent's initial price. Only the agent's final price and the eagerness factor are evolved, in a competitive co-evolution process which involves two sub-populations representing a set of buyers and a set of sellers respectively. The average utility achieved by an agent (after applying the particular strategy encoded on a chromosome) is used to measure the fitness of that chromosome (i.e., a strategy). In fact, the GA is not used to build the agents' decision-making mechanisms but to learn the negotiation strategies which lead to optimal joint utilities. The conclusion is that a stable state produced by the evolutionary model does not always match an equilibrium outcome generated by the game-theoretic model. However, if both the buyer agent and the seller agent have a negative utility discount factor (i.e., they are worse off for a delayed agreement), the stable outcome generated by the GA model always matches the equilibrium outcome identified by the game-theoretic model.

In many real-life negotiation settings such as business process management and e-Commerce, negotiation agents have only limited information about their opponents and limited computational resources (bounded rationality) to deliberate solutions. Under such circumstances, it is impossible to predict or specify equilibrium strategies a priori. Therefore, employing a heuristic learning approach to develop the adaptive decision making mechanisms of negotiation agents is desirable. A fuzzy similarity-based trade-off mechanism has been developed to search for near optimal negotiation solutions under the constraints imposed by real-world applications [40, 11]. Conceptually, the set of feasible offers $iso_\alpha(\theta) = \{x \mid U_\alpha(x) = \theta\}$ lying on the indifference curve for a given utility aspiration level $\theta$ of an agent $\alpha$ is first identified. Then, a hill-climbing algorithm is applied to learn the trade-off strategy, that is, an offer $z = \text{trade-off}_\alpha(x, y) = \arg\max_{z \in iso_\alpha(\theta)} \{Sim(z, y)\}$, such that z is most similar to the opponent's latest offer y, in order to maximise the chance of reaching an agreement. The usual interpretation of a fuzzy similarity function $Sim_j(x_j, y_j)$ is $Sim_j(x_j, y_j) = \min_{1 \leq i \leq m} (h_i(x_j) \leftrightarrow h_i(y_j))$, where $x_j, y_j$ are the issues subject to comparison and $h_i(x_j), h_i(y_j)$ are the corresponding criteria functions. The notation $\leftrightarrow$ represents a fuzzy equivalence operator. Nevertheless, Faratin et al.'s implementation [11] adopts a weighted mean approach to define the fuzzy similarity operation: $Sim_j(x_j, y_j) = \sum_{1 \leq i \leq m} w_i \times (h_i(x_j) \leftrightarrow h_i(y_j))$. Experimental results confirm that the heuristic algorithm is effective in generating trade-offs in a range of negotiation contexts.
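A compact sketch of the weighted-mean similarity and the trade-off selection follows; the particular fuzzy equivalence operator and the issue-by-issue comparison with shared criteria are simplifying assumptions.

```python
def fuzzy_equiv(a, b):
    """A simple fuzzy equivalence on [0, 1]: 1 - |a - b| (one of several
    operators permitted in [11])."""
    return 1.0 - abs(a - b)

def similarity(xj, yj, criteria, weights):
    """Weighted-mean fuzzy similarity between two issue values, as in [11]."""
    return sum(w * fuzzy_equiv(h(xj), h(yj)) for h, w in zip(criteria, weights))

def trade_off(iso_offers, opponent_offer, criteria, weights):
    """Select the iso-utility offer most similar to the opponent's last offer.

    `iso_offers` approximates iso_alpha(theta); offers are vectors of
    issue values compared issue by issue with the same criteria and
    weights, a simplification of the full mechanism.
    """
    def sim(z):
        return sum(similarity(zj, yj, criteria, weights)
                   for zj, yj in zip(z, opponent_offer))
    return max(iso_offers, key=sim)
```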

7. Conclusions

Real-world negotiation scenarios such as those found in B2B environments are characterised by combinatorially complex negotiation spaces, tough negotiation deadlines, limited information about the opponents, and volatile negotiator preferences. Therefore, practical negotiation systems must be equipped with effective learning mechanisms to automatically acquire domain knowledge from the negotiation environments and continuously adapt to the dynamic negotiation contexts. Our proposed GA-based adaptive


negotiation agents are empowered by effective evolutionary learning mechanisms such that these agents can learn the opponents' preferences gradually and continuously adapt to the changing negotiation contexts. The design of our GA-based adaptive negotiation agents fulfils the requirements of practical negotiation systems: these agents can mimic a wide spectrum of negotiation attitudes, identify near optimal solutions based on limited information about the negotiation spaces, continuously learn the opponents' preferences, and adapt to the changing negotiation contexts. Our empirical study shows that under realistic negotiation conditions, the GA-based adaptive negotiation agents outperform a theoretically optimal negotiation model which generally guarantees Pareto optimality. In addition, the proposed GA-based adaptive negotiation mechanism that takes time pressure into account can improve the negotiation outcomes (e.g., joint payoff and success rate) under various agent encounters. Our research work opens the door to the development of practical negotiation mechanisms for real-world applications. Future work includes enhancing our existing genetic algorithm with adaptive genetic operators and taking market-oriented factors into account in multi-lateral negotiations [41, 42, 43, 45].

References

1. Baber, V., R. Ananthanarayanan, and K. Kummamuru: 2002, 'Evolutionary Algorithm Approach to Bilateral Negotiations'. In: E. Lutton, J. A. Foster, J. Miller, C. Ryan, and A. G. B. Tettamanzi (eds.): Proceedings of the 4th European Conference on Genetic Programming, EuroGP 2002, Vol. 2278 of Lecture Notes in Computer Science. Kinsale, Ireland, pp. 203–212.
2. Barbuceanu, M. and W.-K. Lo: 2001, 'Multi-attribute Utility Theoretic Negotiation for Electronic Commerce'. In: F. Dignum and U. Cortes (eds.): Agent-Mediated Electronic Commerce III — Current Issues in Agent Based Electronic Commerce Systems. pp. 15–30.
3. Bazerman, M., T. Magliozzi, and M. Neale: 1985, 'Integrative bargaining in a competitive market'. Organization Behavior and Human Decision Processes 35, 294–313.
4. Brehmer, B. and K. Hammond: 1977, 'Cognitive factors in interpersonal conflict'. In: D. Druckman (ed.): Negotiations: Socio-Psychological Perspectives. Sage, pp. 216–270.
5. Bui, H. H., S. Venkatesh, and D. Kieronska: 1999, 'Learning Other Agents' Preferences in Multiagent Negotiation using the Bayesian Classifier'. International Journal of Cooperative Information Systems 8(4), 275–293.
6. Choi, S. P. M., J. Liu, and S.-P. Chan: 2001, 'A genetic agent-based negotiation system'. Computer Networks 37(2), 195–204.
7. de Jong, E. D. and J. B. Pollack: 2004, 'Ideal Evaluation from Coevolution'. Evolutionary Computation 12(2), 159–192.
8. Duda, R. and P. Hart: 1973, Pattern Classification and Scene Analysis. New York, New York: John Wiley & Sons.
9. Ehtamo, H., E. Ketteunen, and R. Hamalainen: 2001, 'Searching for joint gains in multi-party negotiations'. European Journal of Operational Research 130, 54–69.
10. Faratin, P., C. Sierra, and N. R. Jennings: 1998, 'Negotiation Decision Functions for Autonomous Agents'. Journal of Robotics and Autonomous Systems 24(3-4), 159–182.
11. Faratin, P., C. Sierra, and N. R. Jennings: 2002, 'Using Similarity Criteria to Make Issue Trade-offs in Automated Negotiations'. Artificial Intelligence 142(2), 205–237.
12. Faratin, P., C. Sierra, N. R. Jennings, and P. Buckle: 1999, 'Designing Responsive and Deliberative Automated Negotiators'. In: Proceedings of the AAAI Workshop on Negotiation: Settling Conflicts and Identifying Opportunities. Orlando, FL, pp. 12–18.
13. Fatima, S., M. Wooldridge, and N. R. Jennings: 2003, 'Comparing Equilibria for Game-Theoretic and Evolutionary Bargaining Models'. In: Proceedings of the International Workshop on Agent-Mediated Electronic Commerce V. Melbourne, Australia, pp. 70–77.
14. Fatima, S., M. Wooldridge, and N. R. Jennings: 2004, 'An agenda based framework for multi-issues negotiation'. Artificial Intelligence 152(1), 1–45.


15. Gerding, E., D. van Bragt, and H. L. Poutre: 2003, 'Multi-Issue Negotiation Processes by Evolutionary Simulation, Validation and Social Extensions'. Computational Economics 22, 39–63.
16. Goldberg, D.: 1989, Genetic Algorithms in Search, Optimisation and Machine Learning. Reading, Massachusetts: Addison-Wesley.
17. Guttman, R., A. Moukas, and P. Maes: 1998, 'Agent-mediated Electronic Commerce: A Survey'. Knowledge Engineering Review 13(2), 147–159.
18. Harsanyi, J. and R. Selten: 1972, 'A generalised nash solution for two-person bargaining games with incomplete information'. Management Sciences 18(5), 80–106.
19. He, M., N. R. Jennings, and H. Leung: 2003, 'On agent-mediated electronic commerce'. IEEE Trans. on Knowledge and Data Engineering 15(4), 985–1003.
20. Heiskanen, P.: 1999, 'Decentralized method for computing Pareto solutions in multiparty negotiations'. European Journal of Operational Research 117, 578–590.
21. Jennings, N. R., P. Faratin, A. R. Lomuscio, S. Parsons, C. Sierra, and M. Wooldridge: 2001, 'Automated negotiation: prospects, methods and challenges'. Journal of Group Decision and Negotiation 10(2), 199–215.
22. Jonker, C. and J. Treur: 2001, 'An Agent Architecture for Multi-Attribute Negotiation'. In: B. Nebel (ed.): Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence (IJCAI-01). pp. 1195–1201.
23. Keeney, R. and H. Raiffa: 1993, Decisions with Multiple Objectives: Preferences and Value Tradeoffs. Cambridge, UK: Cambridge University Press.
24. Kraus, S.: 1997, 'Negotiation and Cooperation in Multi-Agent Environments'. Artificial Intelligence 94(1–2), 79–97.
25. Kraus, S., K. Sycara, and A. Evenchik: 1998, 'Reaching Agreements through Argumentation: A Logical Model and Implementation'. Artificial Intelligence 104(1–2), 1–69.
26. Krause, P., S. Ambler, M. Elvang-Goransson, and J. Fox: 1995, 'A logic of argumentation for reasoning under uncertainty'. Computational Intelligence 11, 113–131.
27. Krovi, R., A. Graesser, and W. Pracht: 1999, 'Agent Behaviors in Virtual Negotiation Environments'. IEEE Transactions on Systems, Man, and Cybernetics 29(1), 15–25.
28. Lau, R.: 2002, 'Evolutionary Approach for Adaptive Negotiation Agents in B2B E-commerce'. In: L. Wang, K. C. Tan, T. Furuhashi, J.-H. Kim, and X. Yao (eds.): Proceedings of the 4th Asia-Pacific Conference on Simulated Evolution and Learning. Singapore, pp. 577–581.
29. Lau, R., M. Tang, and O. Wong: 2004, 'Towards Genetically Optimised Responsive Negotiation Agents'. In: Proceedings of the 4th IEEE/WIC International Conference on Intelligent Agent Technology. Beijing, China, pp. 295–301.
30. Lomuscio, A. R. and N. R. Jennings: 2003, 'A Classification Scheme for Negotiation in Electronic Commerce'. Journal of Group Decision and Negotiation 12(1), 31–56.
31. Matos, N. and C. Sierra: 1998, 'Evolutionary Computing and Negotiating Agents'. In: P. Noriega and C. Sierra (eds.): Proceedings of the First International Workshop on Agent Mediated Electronic Trading (AMET-98), Vol. 1571 of Lecture Notes in Artificial Intelligence. Minneapolis, MN, pp. 126–135.
32. Matos, N., C. Sierra, and N. R. Jennings: 1998, 'Determining successful negotiation strategies: an evolutionary approach'. In: Y. Demazeau (ed.): Proceedings of the 3rd International Conference on Multi-Agent Systems (ICMAS-98). Paris, France, pp. 182–189.
33. Matwin, S., T. Szapiro, and K. Haigh: 1991, 'Genetic Algorithm Approach to a Negotiation Support System'. IEEE Transactions on Systems, Man, and Cybernetics 21(1), 102–114.
34. Nash, J. F.: 1950, 'The Bargaining Problem'. Econometrica 18, 155–162.
35. Parsons, S., C. Sierra, and N. Jennings: 1998, 'Agents that Reason and Negotiate by Arguing'. Journal of Logic and Computation 8(3), 261–292.
36. Pruitt, D.: 1981, Negotiation Behaviour. Academic Press.
37. Raiffa, H.: 1982, The Art and Science of Negotiation. Harvard University Press.
38. Rosenschein, J. and G. Zlotkin: 1994, 'Task Oriented Domains'. In: Rules of Encounter: Designing Conventions for Automated Negotiation among Computers. Cambridge, Massachusetts: MIT Press, pp. 29–52.
39. Rubenstein-Montano, B. and R. A. Malaga: 2002, 'A Weighted Sum Genetic Algorithm to Support Multiple-Party Multi-Objective Negotiations'. IEEE Transactions on Evolutionary Computation 6(4), 366–377.


40. Sierra, C., P. Faratin, and N. R. Jennings: 1999, 'Deliberative Automated Negotiators Using Fuzzy Similarities'. In: Proceedings of the EUSFLAT-ESTYLF Joint Conference on Fuzzy Logic. Palma de Mallorca, Spain, pp. 155–158.
41. Sim, K.: 2004, 'Negotiation Agents that make prudent compromises and are slightly flexible'. Computational Intelligence, Special Issue on Business Agents and the Semantic Web, pp. 643–662.
42. Sim, K. and C. Choi: 2003, 'Agents that react to changing market situations'. IEEE Transactions on Systems, Man and Cybernetics, Part B: Cybernetics 33(2), 188–201.
43. Sim, K. and S. Wang: 2004, 'Flexible Negotiation Agent with Relaxed Decision Rules'. IEEE Transactions on Systems, Man and Cybernetics, Part B: Cybernetics 34(3), 1602–1608.
44. Sim, K. M.: 2002, 'A Market-driven Model for Designing Negotiation Agents'. Computational Intelligence, Special Issue in Agent Technology for E-commerce 18(4), 618–637.
45. Sim, K. M.: 2003, 'Equilibrium Analysis of Market-driven Agents'. ACM SIGECOM: E-commerce Exchange 4.2, 32–40.
46. Sycara, K.: 1989, 'Multi-agent compromise via negotiation'. In: L. Gasser and M. Huhns (eds.): Distributed Artificial Intelligence II. Morgan Kaufmann, pp. 119–139.
47. von Neumann, J. and O. Morgenstern: 1944, The Theory of Games and Economic Behaviour. Princeton University Press.
48. Wooldridge, M. and N. Jennings: 1995, 'Intelligent Agents: Theory and Practice'. Knowledge Engineering Review 10(2), 115–152.
49. Zeng, D. and K. Sycara: 1997, 'Benefits of Learning in Negotiation'. In: Proceedings of the 14th National Conference on Artificial Intelligence and 9th Innovative Applications of Artificial Intelligence Conference (AAAI-97/IAAI-97). Menlo Park, pp. 36–42.
50. Zeng, D. and K. Sycara: 1998, 'Bayesian Learning in Negotiation'. International Journal of Human-Computer Studies 48(1), 125–141.
51. Zeuthen, F.: 1967, Problem of Monopoly and Economic Warfare. London, U.K.: Routledge and Kegan Paul.
52. Zhang, D. and W. Wong: 2000, 'A Web-Based Negotiation Agent Using CBR'. In: R. Kowalczyk and M. Lee (eds.): Proceedings of the Second Workshop on AI in Electronic Commerce (AIEC 2000), Vol. 2112 of Lecture Notes in Artificial Intelligence. Melbourne, Australia, pp. 183–197.