Overviewandconstructionofmeshfreebasisfunctions:From ...dilbert.engr.ucdavis.edu/~suku/maxent/papers/maxent...and ﬁnite element and meshfree methods to name a few. In this paper,

INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN ENGINEERINGInt. J. Numer. Meth. Engng 2007; 70:181–205Published online 17 October 2006 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/nme.1885

Overview and construction of meshfree basis functions: Frommoving least squares to entropy approximants

N. Sukumar1,∗,† and R. W. Wright2

1Department of Civil and Environmental Engineering, University of California, Davis, CA 95616, U.S.A.2Department of Mathematics, University of California, Davis, CA 95616, U.S.A.

SUMMARY

In this paper, an overview of the construction of meshfree basis functions is presented, with particularemphasis on moving least-squares approximants, natural neighbour-based polygonal interpolants, andentropy approximants. The use of information-theoretic variational principles to derive approximationschemes is a recent development. In this setting, data approximation is viewed as an inductive inferenceproblem, with the basis functions being synonymous with a discrete probability distribution and thepolynomial reproducing conditions acting as the linear constraints. The maximization (minimization) ofthe Shannon–Jaynes entropy functional (relative entropy functional) is used to unify the construction ofglobally and locally supported convex approximation schemes. A JAVA applet is used to visualize themeshfree basis functions, and comparisons and links between different meshfree approximation schemesare presented. Copyright q 2006 John Wiley & Sons, Ltd.

Received 10 August 2006; Accepted 10 August 2006

KEY WORDS: information theory; relative entropy; radial basis functions; natural neighbours; convexapproximation schemes; JAVA programming

1. INTRODUCTION

Data approximation from a scattered set of points (nodes) in Rd is required in many applications:computer graphics and visualization, image processing, regression models, supervised learning,and finite element and meshfree methods to name a few. In this paper, we focus on the constructionof approximation schemes within the convex hull of a set of nodes—convex polygons (n-gons)and scattered set of nodes are considered. Barycentric co-ordinates on irregular polygons are of

∗Correspondence to: N. Sukumar, Department of Civil and Environmental Engineering, University of California, OneShields Avenue, Davis, CA 95616, U.S.A.

†E-mail: [email protected]

Contract/grant sponsor: NSF; contract/grant number: CMS-0626481, DMS-0135345

Copyright q 2006 John Wiley & Sons, Ltd.

182 N. SUKUMAR AND R. W. WRIGHT

interest in computer graphics and geometric modelling [1–3], and they are also used in polygonalfinite element methods [4]. The scattered data approximation problem that we consider has beenstudied using moving least-squares (MLS) approximants [5], natural neighbour-based interpolants[6, 7], and radial basis functions (RBFs) [8–10]. A recent advance in this direction has been theuse of information-theoretic variational principles to construct meshfree basis functions [11–13].We elaborate on the rationale of this approach, and provide a unifying framework to view entropyapproximants. A JAVA applet is developed to visualize meshfree basis functions, with an aim toreadily discern the similarities and distinctions between different meshfree approximants.

The outline of this paper follows. We first present some of the essential properties of dataapproximations schemes, and then finite element and meshfree Galerkin methods are touchedupon. In Section 3, the main functionalities and capabilities of the JAVA applet are presented, andbasis function plots that are created using the applet appear in Sections 4 and 5. The constructionof MLS approximants and barycentric co-ordinates on irregular polygons are described in Sec-tion 4. To motivate the adoption of entropy-based approximants, the key ingredients of Bayesiantheory of probability are outlined in Section 5. In Section 5.1, we present the derivation of basisfunctions using Jaynes’s principle of maximum-entropy (MAXENT) [14, 15] as well as through itsgeneralization, the principle of minimum relative entropy (Shannon–Jaynes entropy functional)[16–18]. Entropy-based higher-order approximation schemes are proposed in Section 5.2, and weclose with a few concluding remarks in Section 6.

2. PRELIMINARIES

Consider a set of distinct nodes {xi }ni=1 in Rd . The convex hull, C⊂ Rd , is the smallest convex setthat contains all the nodes. Let us introduce a point p with co-ordinate x within C⊂ R2 (Figure 1).An approximation scheme for a scalar-valued function u(x) : C→ R can be written as

uh(x) =n∑

i=1�i (x)ui (1)

where ui are coefficients (nodal values if uh is an interpolant) and �i (x) is the basis functionof node i . If xi are the vertices of a polyhedron, then {�i }ni=1 are the barycentric co-ordinatesof p. On convex polygons, the MAXENT basis functions are a barycentric co-ordinate, but they arenon-interpolatory on a set of scattered nodes (Figure 1(d)). Maximum-entropy basis functions were

x

1 2

3

(a)

3

21

4

x

(b)

x 36

5 4

21(c)

x

3

21

4

79

65

8

(d)

Figure 1. (a) Triangle; (b) square; (c) hexagon; and (d) scattered nodes.

Copyright q 2006 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2007; 70:181–205DOI: 10.1002/nme

OVERVIEW AND CONSTRUCTION OF MESHFREE BASIS FUNCTIONS 183

introduced by Sukumar [11] and Arroyo and Ortiz [12]. In Reference [12], its link to convex analysis[19] were noted, and these approximants were referred to as convex approximation schemes. ForEquation (1) to be a linearly precise barycentric co-ordinate or convex approximation scheme [12],the basis functions must satisfy the following properties:

1. Ability to reproduce constant and linear functions exactly:

∀x,n∑

i=1�i (x)= 1 and

n∑i=1

�i (x)xi = x (2)

For second-order partial differential equations (PDEs), approximants that possess constantand linear precision are sufficient for convergence in a Galerkin method [20].

2. The {�i (x)}ni=1 are sufficiently smooth (typically C∞) in the interior of C.3. Convex combination:

�i (x)�0 ∀i, x (3)

which in conjunction with Equation (2) indicates that the �i (x) are bounded between zero andunity and satisfy the convex hull property. The non-negative condition leads to the variationdiminishing property and to positive-definite mass matrices [12]. Convex approximationschemes are not prone to the Runge phenomena [21], which occurs when using higher-orderone-dimensional Lagrange interpolation on uniform grids. In addition, optimal conditioningcan be established for non-negative basis functions [22–24]. The adoption of NURBS-basedconvex basis functions to ensure geometric exactness in finite element analysis has beenrecently introduced by Hughes and co-workers [25]. MLS approximants [5], which arewidely used in meshfree Galerkin methods, are not convex approximants since MLS basisfunctions can be negative.

Consider the Poisson equation with homogeneous Dirichlet (essential) boundary conditions:

−∇2u = f in �, u = 0 on �� (4)

where �⊂ R2 is the problem domain and �� is its boundary. In the finite element method, thedomain is partitioned into M non-overlapping subdomains (elements), �= ⋃M

k=1Tk , where Tkis either a triangle or a quadrilateral (isoparametric transformation of the elements shown inFigures 1(a) and (b)). In finite elements, basis functions are associated with each vertex (node)of Tk , and within an element, the local restriction of a basis function is known as a shapefunction.

In a Galerkin method, a weak or variational statement of the strong form, Equation (4), is used:find u ∈ H1

0 (�) such that∫�

∇(�u) · ∇u d�=∫

�f �u d� ∀�u ∈ H1

0 (�) (5)

where H10 (�) is the Sobolev space of functions in L2(�) whose derivatives are also square-

integrable, and with vanishing function values on ��, and �u denotes the first variation of u.In finite element and meshfree Galerkin methods, u(x) is approximated by uh(x) (belongs to afinite-dimensional subspace of H1

0 (�)) of the form given in Equation (1). A similar approximationis used for the test function �u. Since the nodal coefficients that appear in �uh are arbitrary,



a linear system of equations is obtained:

Ku= f, Ki j =∫

�∇�i · ∇� j d�, fi =

∫�

�i f d� (6)

In the finite element method, the shape functions �i (x) are defined on reference elements(Figures 1(a) and (b)), and through an isoparametric transformation, they are obtained on anytriangle or convex quadrilateral. Finite element basis functions satisfy three key properties, whichfacilitates the exact imposition of linear Dirichlet boundary conditions:

1. Basis functions satisfy linear reproducing conditions (linearly complete) on meshes witheither triangular or quadrilateral elements.

2. Interior nodal basis functions do not contribute at a point p that lies on the boundary of thedomain.

3. The basis functions possess the Kronecker-delta property (cardinal basis): �i (x j ) = �i j .

On meshes with convex n-gons, MAXENT basis functions are a barycentric co-ordinate and theysatisfy all the above properties [11]. Maximum-entropy basis functions meet the first and secondproperty on convex domains, but for three or more nodes that are collinear on the boundary of thedomain, a weaker Kronecker-delta property is satisfied. Extensive research in meshfree methodshas focused on modifications to MLS (e.g. use of singular weight functions or transformations)and to other meshfree approximants with the objective of constructing basis functions with theKronecker-delta property. The lack of the Kronecker-delta property on the boundary by itselfdoes not pose any limitation; the first two properties that are indicated above are the critical onesand they suffice to enable the imposition of essential boundary conditions in MAXENT meshfreemethods as in finite elements [12].

3. JAVA APPLET FOR VISUALIZATION OF MESHFREE BASIS FUNCTIONS

There have been significant advances in the use of meshfree approximants for the solution of partialdifferential equations [26–34]. However, in spite of the maturity of meshfree methods, there arecurrently no tools in the public-domain to visualize meshfree basis functions. We have developedJAVA applets to visualize basis functions in one, two, and three dimensions; the two-dimensionalJAVA applet menu is shown in Figure 2. The applets serve as a suitable aid to readily discern thesimilarities and distinctions between the different meshfree approximation schemes.

The creation of a web-accessible JAVA programme allows users to create an arbitrary nodalset (convex or non-convex) in one, two or three dimensions by inputting the co-ordinates of itsnodes, with the aid of a direct data entry form or a point-and-click interface. Options such as P(polygon), Q (quadtree grid), and R (random nodes in a unit square) are indicated in Figure 2.This programme displays a visualization of the basis function associated with a node and specificformulation, both picked by the user. Available formulations in two dimensions include Delaunayand polygonal interpolation schemes, MLS, and maximum-entropy approximations using a uniformprior distribution (lead to global basis functions) and a compactly supported prior distribution (leadto compactly supported basis functions). In one dimension, individual basis functions or all thebasis functions on a grid can be displayed. Two-dimensional basis functions may be dislayed ascontour plots or surface plots. The plots are generated by dividing the polygonal element intotriangles and then dividing the triangles recursively. The user can calculate the value of the basis



Figure 2. JAVA applet menu.



functions at an arbitrary point within the convex hull. In 3D, basis function values on planes thatcut the convex hull are computed.

The visualization package is coded as a Java 1.4 applet and embedded in an HTML page thatdetects screen resolution and adjusts the applet’s behaviour accordingly. The code is object-orientedand hence modular, making it fairly easily extended. Visualizations resemble as closely as possiblethe desirable styles of texts and journals, and software such as MathematicaTM and MatlabTM.The generated visualizations are suitable for publication (EPS option in Figure 2). The capabilitiesof the applet are demonstrated through basis function plots that appear in the ensuing sections.

4. MESHFREE APPROXIMANTS AND POLYGONAL INTERPOLANTS

Currently, most meshfree Galerkin methods are based on approximants that can be classifiedinto three distinct types: RBFs [8, 9], MLS approximants [5], and natural neighbour interpolationschemes [6, 7]. In meshfree methods, the approximation scheme is of the form Equation (1), butthe construction of the nodal basis functions {�i }ni=1 is not tied to a background element structure.A brief description of MLS, RBFs, and polygonal interpolants follows.

4.1. Moving least squares approximants

In the MLS approximation, each node is associated with a compactly supported weight function,w : [0,∞) → R:

wi (x)≡w(qi ), qi = ‖x − xi‖rmaxi

(7)

where ‖ · ‖ is the L2 norm of its argument, rmaxi is the radius of support for the nodal weight

function, and w(q) is a smooth, non-increasing weight function that is maximal at q = 0 andvanishes for q�1. The global MLS approximation is [5]:

uh(x)=m∑j=1

p j (x)a j (x) ≡n∑

i=1�i (x)ui (8)

where p is a basis vector (for example, p={1 x y}T is a linear basis in 2D) and a j (x) are unknownparameters that are found by solving a quadratic weighted least squares minimization problem [5]:

mina

1

2

n∑i=1

wi (x)[pT(xi )a(x) − ui ]2 or mina

1

2(Pa − u)TW(Pa − u) (9)

On carrying out the minimization, the solution for the MLS basis functions is given by [28]�i (x)= pT(x)A(x)−1Bi (x) (10a)

where the matrices A(x) and B(x) are

A(x) =n∑

i=1wi (x)p(xi )pT(xi ) (10b)

B(x) = [w1(x)p(x1), w2(x)p(x2), . . . , wn(x)p(xn)] (10c)



In the numerical implementation, a cubic spline weight function, w(q)∈C2(R+), is used [35]:

w(q)=

⎧⎪⎪⎪⎪⎪⎪⎨⎪⎪⎪⎪⎪⎪⎩

2

3− 4q2 + 4q3 if 0�q�1

2

4

3− 4q + 4q2 − 4q3

3if

1

2�q�1

0 otherwise

(11)

The nodal weight function support radius rmaxi = �hi , where hi is chosen for each node by a

procedure very similar to Algorithm 2 in Reference [36]:For the set of nodes {xi }ni=1 and their convex hull C,

1. Choose positive integersm1 andm2 and a constant ��0>1 and set hi = 0 for i = 1, 2, . . . , n.2. Assemble a set of points P in three steps:

(a) Create a [m1]d uniform grid of points over C and discard those falling outside.(b) Add m2 random, uniformly distributed points in C.(c) Add the nodes {xi }ni=1.

3. For each p∈P,

(a) Find the d+1 nodes {xi∗}d+1i∗=1 in {xi }ni=1 that are closest to p, and compute their Euclidean

distance di∗ = ‖p − xi∗‖.(b) If hi∗<di∗ , set hi∗ = di∗ .

4. Set rmaxi = �hi for i = 1, 2, . . . , n.

For the 2D applet, m1 = 100, m2 = 10 000, �0 = 1.01, and the constant � is chosen by the user viaa slider. This procedure is necessary to ensure that every point in C is covered by at least d + 1nodal weight function supports, so that the matrix A(x), defined in Equation (10b), has full rank.

4.2. Radial basis functions

Consider the approximation of a function u(x) : Rd → R using the set of scattered nodes {xi }ni=1.In the RBF approximation, a fixed radial function � : Rd → R is chosen, i.e. �(x)≡ �(‖x‖) with� : [0,∞) → R. On using translates of this radial function with centres at xi , the following ansatzis made [10]:

uh(x) =n∑

i=1�(‖x − xi‖)ai (12)

where ai are unknown coefficients. Often, a polynomial term is also included in the above approx-imation if global polynomial reproducibility is desired. For certain choices of �(·), for example,Gaussian, multiquadrics, or thin-plate splines, the matrix Ki j = �(‖x j − xi‖) is positive-definiteand invertible, and hence the data interpolation problem, uh(x j ) = u(x j ) ( j = 1, 2, . . . , n), resultsin a unique solution for a. The use of RBFs in collocation-based meshfree methods was initiatedby Kansa [37, 38], and new developments and advances continue to emerge in this topical researcharea. In this paper, RBFs are adopted as prior distributions (weights) within the Shannon–Jaynesmaximum-entropy formalism.



4.3. Polygonal interpolants

Using elements of projective geometry, Wachspress [39] proposed rational polynomial interpolantsfor convex polygons. Recently, there have been additional contributions on the construction ofbarycentric co-ordinates on irregular polygons [1, 2, 4, 11]. A review on the construction of polyg-onal interpolants is presented by Sukumar and Malsch [40].

In Reference [1], a simple expression is obtained for Wachspress’s basis functions:

�i (x)= wi (x)∑nj=1 w j (x)

, wi (x)= A(pi−1, pi , pi+1)

A(pi−1, pi , p)A(pi , pi+1, p)= cot �i + cot �i

‖x − xi‖2(13)

where the last expression is used in our numerical implementation. In the above equation, A(a, b, c)is the signed area of triangle [a, b, c], and �i and �i are shown in Figure 3(a).

Floater [2] used the mean value theorem for harmonic functions to develop barycentricco-ordinates on polygons. The linearly precise mean value co-ordinate is [2]:

�i (x) = wi (x)∑nj=1 w j (x)

, wi (x)= tan(�i−1/2) + tan(�i/2)

‖x − xi‖ (14)

where the angle �i is shown in Figure 3(b). In the JAVA applet, the algorithm for mean valueco-ordinates proposed by Hormann (Figure 6 in Reference [41]) is used. The implementation isvalid for convex and non-convex polygons.

Natural neighbour interpolation methods are Voronoi-based convex approximation schemes thatinterpolate nodal data and share many common properties with the finite element interpolant.Cueto and co-workers [42] provide an overview of the construction of natural neighbour-basedinterpolants. In Reference [4], Laplace basis functions [7] are constructed on regular polygons,and through an isoparametric mapping, the basis functions are defined on irregular polygons.The Wachspress basis functions and mean value co-ordinates are directly computed on irregularpolygons, which is also the case in a recently proposed non-conforming finite element methodon polyhedral meshes [43]. The interested reader can refer to Reference [40] and the referencestherein for further details on the construction and implementation of polygonal interpolants.

In Figure 4, Wachspress, mean value and Laplace basis functions for the hexagon in Figure 1(c)are plotted. These basis functions share the properties of polygonal barycentric co-ordinates.In Figure 5, the capabilities of the applet are further illustrated by presenting Laplace basis

iγδ i

p

p

pi+1

pi

(a) p

pi

pi+1

β i

γ

γi

β

pα

αi

(b)

Figure 3. Barycentric co-ordinates: (a) Wachspress [1]; and (b) mean value co-ordinates [2].



Figure 4. Hexagonal basis functions (node 1): (a) Wachspress; (b) meanvalue co-ordinates; and (c) Laplace.

Figure 5. Laplace basis functions (node 1) for a regular pentagon at varyingresolutions (a)–(d) and (e) 3D plot.

Figure 6. Mean value co-ordinates on concave polygons (node 1): (a) hexagon; and (b) octagon.



function contour plots on a regular pentagon for varying resolutions; a 3D perspective is shown inFigure 5(e). Mean value co-ordinates are also linearly precise on concave (non-convex) polygons. InFigure 6, contour plots of mean value co-ordinates are shown for a concave hexagon and a concaveoctagon.

5. BAYESIAN THEORY OF PROBABILITY AND ENTROPIC MEASURES

A recent development in the construction of meshfree approximants has been the use of information-theoretic variational principles [11–13]. To provide greater details and insights on the rationalefor this approach, we present some of the essential ingredients of Bayesian theory of probabil-ity and its ties to inductive inference. In References [11, 12], data approximation is viewed asa problem in inductive inference. Pure mathematics follows the principle of deductive logic—given a cause, many logical consequences can be readily inferred (Figure 7(a)). However, inscientific problems, the reverse is more common: given certain effects or observations, the mostlikely underlying causes are desired. This requires inductive logic (Figure 7(b)), as in ill-posedinverse problems (e.g. heat conduction, scattering, image reconstruction) that arise in science andengineering [45].

Probability theory as a rational inductive inference procedure was initiated by Bayes and Laplace,and subsequently formalized by Jeffreys [46] and Cox [47]. In information theory [48], the notionof entropy as a measure of uncertainty or incomplete knowledge was introduced by Shannon [14].Building on these previous contributions, Jaynes [15, 49] proposed the principle of maximum-entropy (MAXENT), in which it was shown that maximizing entropy provides the least-biasedstatistical inference when insufficient information is available. In References [11, 12], the basisfunctions {�i }ni=1 are viewed as a discrete probability distribution {pi }ni=1, and the polynomialreproducing conditions are the under-determined constraints. To regularize the ill-posed problem,the maximum-entropy principle was used. In this paper, as a generalization, the Shannon–Jaynesentropy functional and the MAXENT or minimum relative entropy principle [16–18] is invoked toobtain the basis functions. Sivia [44] presents an excellent introduction to Bayesian inference andmaximum-entropy methods, whereas Jaynes [50] provides a more rigorous and in-depth look atprobability theory from the Bayesian perspective.

In Bayesian theory, probability is a subjective measure that represents a degree-of-belief and isalways ‘conditional,’ which is contrary to the (objective) frequentist definition. The Bayesian viewconsists of three stages that are essential to the process of inductive inference [50–52]:

1. Bayes’s theorem: If h stands for a hypothesis, d for a set of data, and I for background(testable) information, then Bayes’s theorem states that:

p(h | d, I )︸︷︷︸posterior pdf

= p(h | I )︸︷︷︸prior

× p(d | h, I )︸︷︷︸likelihood

/ p(d | I )︸︷︷︸evidence

(15)

where p(·) is used to denote either the probability (discrete) or the probability density function,pdf (continuous), and in a parameter-estimation problem, the denominator is just a normalizingfactor since the posterior pdf must integrate to unity. In essence, Bayes’s theorem is a rule formanipulating probabilities and not for their assignment—the prior probability of h gets updatedto the posterior probability as a result of acquiring the data.



Cause Effects orOutcomes

(a)

Possible Effects orObservationsCauses

(b)

Figure 7. (a) Deductive logic; and (b) inductive logic [44].

2. Maximum-entropy principle: In information theory, Shannon introduced the notion of entropyas a measure of uncertainty [14]. The Shannon entropy of a discrete probability distribution is:

H(p) =E[− log p] = −n∑

i=1pi ln pi (16)

where pi ≡ p(xi ) is the probability of the occurrence of the event xi , p ln p.= 0 if p= 0, E[·]

is the expectation operator, and the above form of the entropy H(·) satisfies the axiomaticrequirements of an uncertainty measure, with (1) H(p)�0; (2) H(p) attains its maximum valuewhen p1 = p2 = · · · = pn = 1/n and it’s a monotonic function; and (3) H(p1, p2, . . . , pn) =H(p1, p2, . . . , pn, 0) being the most important properties [53].Entropy maximization was proposed by Jaynes [15] as a means for least-biased statistical

inference when insufficient information is available, and was shown to reproduce equilibrium(Gibbs–Boltzmann) and non-equilibrium distributions in statistical mechanics [16, 50, 54]. It is theonly consistent variational principle for the assignment of probabilities under a set of constraints(testable information) [16, 18]. Let the available data pertaining to a random variable X consistof the expected value of functions gr (x) (r = 0, 1, . . . ,m), with g0(x)= 1 being the normalizingcondition. Then, the discrete probabilities are found by solving [15]

maxp∈Rn+

(H(p) =−

n∑i=1

pi ln pi

)(17a)

n∑i=1

pi = 1,n∑

i=1pi gr (xi ) =E[gr (x)] (r = 1, 2, . . . ,m) (17b)

where Rn+ is the non-negative orthant. Often, the first m + 1 moments of the random variableare available, which leads to the classical maximum-entropy problem of moments [55]. For in-stance, if the mean � of a random variable X is known, then the discrete problem is posedas [44, p. 121]

maxp∈Rn+

(H(p) = −

n∑i=1

pi ln pi

)(18a)

n∑i=1

pi = 1,n∑

i=1pi xi = � (18b)



which is solved using the method of Lagrange multipliers, and in the continuous case with limits0 to ∞, we obtain the exponential distribution [44, p. 121]

p(x | �) = 1

�exp

(− x

�

), x�0 (19)

If the first moment � and variance �2 of X are known, then in the continuous case with limits ±∞,we obtain the solution [44, p. 122]

p(x | �, �) = 1

�√2�

exp

(− (x − �)2

2�2

)(20)

which is the Gaussian distribution—a consequence that follows if only the mean (first moment) andvariance of data are known. It was recognized that for H(·) to be invariant under the transformationy = f (x) in the continuous case, the general form of the entropy should be [16–18]

H(p,m) =−n∑

i=1pi ln

(pimi

)or H(p,m) = −

∫p(x) ln

(p(x)

m(x)

)dx (21)

where m(x)(mi ) is a prior distribution that estimates p(x)(pi ). In the literature, the quantityD(p‖m) =−H(p,m) is also referred to as the Kullback–Leibler distance (directed divergence) [56],and the variational principle is known as the principle of minimum relative (cross) entropy [18].The relative entropy, D(p‖m)�0, which is proven below.

ProofIf f is a concave function and X a random variable, then by Jensen’s inequality (see Reference[48, p. 25])

E[ f (X)]� f (E[X ]) (22)

The Shannon–Jaynes entropy (negative of the relative entropy) functional is

−D(p‖m) =−n∑

i=1pi ln

(pimi

)=

n∑i=1

pi ln

(mi

pi

)(23)

On considering the concave function f (x)= ln x and invoking Jensen’s inequality, we can write

−D(p‖m) =n∑

i=1pi ln

(mi

pi

)� ln

(n∑

i=1pimi

pi

)= ln

n∑i=1

mi = ln 1= 0 (24)

which completes the proof. �

Since ln is a strictly concave function, D(p‖m) attains its minimum value of zero if and only ifp=m. If a uniform prior, mi = 1/n, is used, the principle of minimum relative entropy is identicalto the MAXENT principle using Shannon entropy.

3. Hypothesis space: The choice of the hypothesis space is the key in any inductive inferenceproblem—this refers to the measure space to define m(x) or mi when using the maximum-entropyprinciple or the prior probability p(h | I ) in Bayes’s theorem. The selection of the prior distribution,m(x), is a key element in the construction of MAXENT approximation schemes, which is discussedin the next section.



5.1. Maximum-entropy approximation schemes

In References [11, 12], the maximum-entropy principle using Shannon entropy and a modifiedentropy functional, respectively, were used. In this paper, as a unifying framework and general-ization, we adopt the Shannon–Jaynes entropy measure, Equation (21), and for consistency, thevariational problem is posed as the maximization of the entropy functional, and therefore thedual (unconstrained) problem becomes a convex minimization problem. The parallels betweenthe conditions on �i in Equations (2) and (3) and those on pi in a MAXENT formulation areevident. Referring to the nodal sets shown in Figure 1, the basis function value �i (x) is viewedas the ‘probability of influence of a node i at x’. The maximum-entropy formulation is: find/(x)∈ Rn+ as the solution of the constrained optimization problem:

max/∈Rn+

(H(/,m) =−

n∑i=1

�i (x) ln(

�i (x)mi (x)

))(25a)

subject to the linear reproducing conditions given in Equation (2):

n∑i=1

�i (x)= 1,n∑

i=1�i (x)xi = x (25b)

where mi (x) is a prior estimate, and the constraints form an under-determined linear system. Lets (s = 0, 1, . . . , d) be the Lagrange multipliers associated with the d +1 constraints. The solutionof the variational problem can be written as

�i (x)= Zi (x)Z(x)

, Zi (x) =mi (x) exp(−xTi k(x)) (26)

where the maximum-entropy basis functions naturally assume an exponential form, and Z(x) =∑j Z j (x) is known as the partition function in statistical mechanics. In addition, xTi =[xi yi zi ]

and k(x)=[1(x) 2(x) 3(x)]T in three dimensions. We mention in passing that such exponential(Darmois–Koopman–Pitman) family of distributions are well known and widely studied in statis-tical theory [57] and information geometry [58].

The �i (x) in Equation (26) must satisfy the d linear constraints given in Equation (25b), whichyields d nonlinear equations. On considering the dual formulation, a simple unconstrained convexminimization problem is obtained. To this end, we let xi = xi − x , yi = yi − y, and zi = zi − z(shifted nodal co-ordinates in R3), and then redefine Z appropriately. Now, the dual problem is:find k such that [59, 60]

k= argmin ln Z(kt ) (27)

On using convex duality [19, 60], a detailed mathematical treatment of the primal and dualoptimization problems is presented in Reference [12]. Numerical algorithms such as steepestdescent, Newton’s method, quasi-Newton (variable metric) methods, and interior-point methodsare used to solve such unconstrained optimizations problems [60]. Interior-point methods are at-tractive for large systems with equality and inequality constraints [61]; for entropy maximization, aMatlabTM code, which is based on primal-dual interior method is available in the public-domain[62]. For the JAVA applet, the convex minimization problem is solved using a variable step sizegradient descent algorithm [63], with a convergence tolerance = 10−3.



In Reference [11], a uniform prior was used, whereas in Reference [12], a variational principleusing a modified entropy functional (pareto optima of two objectives) was proposed:

min/∈Rn+

M(/, x) or max/∈Rn+

−M(/, x), M(/, x) = �U (/, x) − H(/) (28a)

where �≡ �(x) is non-negative, H(/) is the Shannon entropy, andU (/, x) is the objective functionintroduced by Rajan [64]:

U (/, x) =n∑

i=1�i (x)‖xi − x‖2 (28b)

The above functional form draws the connection to statistical mechanics, with the free energyG=U − T H , where U is the internal energy, H is the entropy, and T is the temperature [12, 61].Rajan’s linear programming problem is [64]:

min/

U (/, x), �i (x)�0,n∑

i=1�i (x)= 1,

n∑i=1

�i (x)xi = x (29)

whose solution is the finite element (Delaunay) interpolant. At x= x j , nodal interpolation is realizedsince the minimum value U = 0 is attained if �i (x j ) = �i j . When � → ∞ in Equation (28a),Rajan’s problem is obtained, and �= 0 recovers Equation (25a) with a uniform prior. In Sukumar[13], the minimum relative entropy principle was used to unify the above developments. Theentropy functional considered by Arroyo and Ortiz [12] is obtained if a Gaussian (RBF) prior,mi (x) = exp(−�‖xi − x‖2), is used in Equation (25a):

H(/,m) = −n∑

i=1�i (x) ln

(�i (x)

exp(−�‖xi − x‖2))

= −n∑

i=1�i (x) ln�i (x) − �

n∑i=1

�i (x)‖xi − x‖2

= −�U (/, x) + H(/) (30)

which is identical to Equation (28a). The parameter � in the Gaussian distribution is inverselyproportional to the variance; it determines the support-width of the basis function [12].

The choice of the prior, mi (x), gives us greater flexibility in the construction of new approxi-mants, and provides a simple and appealing means to construct globally or compactly supportedconvex approximation schemes. In the spirit of previous research on meshfree methods [26, 27, 65]and partition of unity methods [66], given a prior mi (x) and a set of linear constraints (reproduc-ing conditions), entropy maximization can be viewed as a ‘correction’ to obtain an approxima-tion with polynomial and/or non-polynomial reproducibility. The use of the Shannon or relativeentropy functional provides a means to obtain the least-biased statistical inference solution. WithShannon entropy, the flattest possible distribution that is consistent with the constraints is realized.The maximum-entropy formulation leads to a convex optimization problem, with the approximantpossessing many desirable properties for the Galerkin solution of PDEs [12]. The continuity ofmaximum-entropy basis functions with a Gaussian prior is established in Arroyo and Ortiz [12],and in Sukumar and Wets [67], variational analysis and the theory of epi-convergence [68] is used



to prove the same for any prior distribution. The above properties are lost if other functionalsare adopted—for example, in Sukumar [11] it is shown that if the minimum-norm objective func-tional (leads to the generalized- or pseudo-inverse [69]) is used, then �i (x)<0 is also admissibleand interpolation on the boundary is not realized. If the minimum-norm objective is adoptedwith the non-negative condition, �i (x)�0, as additional constraints, then a convex approximant isobtained; however, numerical tests reveal that the basis functions are continuous but notcontinuously differentiable in the interior of C.

1. Uniform prior: For a uniform prior, mi (x) = 1/n, and as indicated earlier, the Shannon–Jaynes entropy functional is the same as the Shannon entropy (modulo a constant). For this case,the maximum-entropy basis functions are identical to bilinear finite element basis functions on asquare, and are smooth and bounded in C [11]. To illustrate a simple closed-form computation,consider one-dimensional approximation in C= [0, 1] with three nodes located at x1 = 0, x2 = 1

2 ,and x3 = 1. On using Equation (25), the solution for �i (x) is readily derived [40]:

�1(x)= 1

Z, �2(x)= �

Z, �3(x)= �2

Z, � ≡ �(x)= 2x − 1 + √

12x(1 − x) + 1

4(1 − x)(31)

where Z = 1 + � + �2.2. Non-uniform prior: Instead of a uniform prior, a non-uniform prior for node i gives more

weight to xi than to other nodal locations. Now, different choices of the prior mi (x) can be usedin the Shannon–Jaynes entropy functional:

• The prior can be selected to be global radial basis functions such as the Gaussian, mi (x) =exp(−‖xi − x‖2/c2), inverse multiquadrics, mi (x)= (‖xi − x‖2 + c2)−1/2, etc.

• On choosing a weight function, w(x), with compact support, we set mi (x)= wi (x) as theprior for node i , where wi (x) is a translation and scaling of w(x). If the only constraintis:

∑i �i (x)= 1, then the maximum-entropy basis functions are: �i (x) = wi (x)/

∑j w j (x),

which is the well-known Shepard function [70]. If wi (x) is constructed using the C2 cubicspline weight function given in Equation (11), then unlike the MLS approximant, a convexapproximant with desirable properties on the boundary is obtained. Other choices for mi (x)include compactly supported RBFs, for example the C2 function m(r) = (1 − r)4+(4r + 1)[10], where (·)+ = (·) if the argument is non-negative and zero otherwise.

• As alternative compactly supported priors, R-functions [71, 72] or implicit (level set) functionsthat are defined on a graph are also suitable.

Maximum-entropy basis functions with a uniform prior in C= [0, 1] are depicted in Figure 8.For the plots in Figure 8(a), the closed-form expressions for �i (x) are given in Equation (31). Nodalinterpolation is met on bdry C but not at the interior nodes. In Figure 9, the maximum-entropybasis functions for nodes 1 and 6 in Figure 1(d) are illustrated. We note that �1(x) is unity at x1and is piece-wise linear on the boundary, whereas �6(x6) �= 1 and �6(x) vanishes on the boundaryof the square. Laplace and MAXENT basis functions on a weakly convex polygon are shown inFigure 10. Along the edge 1–2, Laplace basis functions satisfy the Kronecker-delta property butthe maximum-entropy basis functions do not; however, �i (x)= 0 (i = 3–5) along edge 1–2 forboth Laplace and MAXENT basis functions.

To demonstrate the properties of convex approximants with a non-uniform prior, we first considera one-dimensional grid. In Figure 11, basis function plots using MLS, and MAXENT with acompactly supported prior are presented. The cubic spline weight function given in Equation (11)



(a) (b) (c)

Figure 8. One-dimensional maximum-entropy basis functions with a uniform prior: (a) regular grid (n = 3);(b) regular grid (n = 5); and (c) random grid (n = 5).

Figure 9. Two-dimensional maximum-entropy basis functions with auniform prior: (a) �1(x); and (b) �6(x).

Figure 10. Basis functions on a weakly convex pentagon: (a) Laplace (node 6); (b) maximum-entropy(node 6); (c) Laplace (node 4); and (d) maximum-entropy (node 4).



(a) (b)

(c) (d)

Figure 11. One-dimensional basis functions: (a,b) uniform grid with MLS (�= 2.5)and maximum-entropy with a cubic spline prior (�= 2.5); and (c,d) random grid with MLS (�= 2.5)

and maximum-entropy with a cubic spline prior (�= 2.5).

is used as the compactly supported prior. We observe that interior MLS basis functions have anon-zero contribution on the boundary (Figures 11(a) and (c)), whereas boundary MAXENT basisfunctions with a cubic spline prior (Figures 11(b) and (d)) satisfy the Kronecker-delta property.Next, we consider the two-dimensional grid shown in Figure 1(d), and study the MAXENT plotsusing a Gaussian prior, when � is varied; see Reference [12] for its applications in nonlinear solidmechanics. For �= 0, 1, 10, 100, the MAXENT basis function plots for node 8 are presented inFigure 12. The value �= 0 corresponds to a uniform prior. It’s observed that as � is increasedthe nodal basis function support shrinks, and when � = 100 (theoretically when � → ∞) the basisfunction support is proximal to the triangular (Delaunay) basis function (Figure 12(d)). In Figure 13,comparisons between the MLS basis function and the MAXENT basis function using the compactlysupported cubic spline prior are presented. The interior MLS basis function is non-zero on bdry C(Figure 13(c)), whereas the interior MAXENT basis function vanishes on the boundary of the square(Figure 13(d)).

As of this writing, the three-dimensional applet is somewhat less developed than the other twoand is restricted to maximum-entropy basis function plots with a uniform prior. For Figure 14,a regular tetrahedron is created, along with one or two interior nodes. Basis functions are plottedalong planes that cut the convex hull.



Figure 12. Maximum-entropy basis function, �8(x), with a Gaussian prior:(a) � = 0; (b) � = 1; (c) � = 10; and (d) �= 100.

5.2. Higher-order approximation schemes

In References [11, 12], linearly complete approximations were constructed using the maximum-entropy principle. Furthermore, in Reference [12], it was shown that the additional constraint∑

i �i (x)x2i = x2 + c in one-dimension with c= 0 does not yield a feasible solution if �i�0.

On choosing c �= 0, non-negative �i (x) can be obtained, which bear resemblance to univariateB-splines [12]. Alternatively, the non-negative condition, �i (x)�0, can be relaxed to obtain anappropriate ‘entropy functional’ that can be maximized. To this end, we start with the generalizationof the Shannon–Jaynes entropy that was proposed by Skilling [73]:

H(/,m) =n∑

i=1

[�i − mi − �i ln

(�i

mi

)](32)

where �i ≡�i (x) and the prior estimate mi ≡mi (x) need not be normalized so that applicabilityis extended to physical distributions other than probabilities. In the absence of any constraints,H(· , ·) is maximized when �i =mi (i = 1, 2, . . . , n). On using the above expression, distributionswith positive and negative values (signed basis functions) are obtained.

Let �i = vi −wi , where vi ∈ R+ and wi ∈ R+, so that �i ∈ R. Also, let mvi and mw

i be the priorestimate for vi and wi , respectively. The total entropy is

H(v,w,mv,mw) =n∑

i=1

[vi − mv

i − vi ln

(vi

mvi

)]+

n∑i=1

[wi − mw

i − wi ln

(wi

mwi

)](33)



Figure 13. Basis functions: �3(x) with: (a) MLS; (b) maximum-entropy with the cubic spline prior;and �8(x) with (c) MLS; and (d) maximum-entropy with the cubic spline prior.

(a) (b) (c)

Figure 14. Three-dimensional maximum-entropy basis functions plots: (a) �1(x) along a plane near nodes 1and 5; (b) �5(x) along a plane containing node 5; and (c) �3(x) along a plane near nodes 1, 3, 5, and 6.

An expression for H in which only / and m appears is desired. Since �i = vi − wi , we have

�H�vi

= �H��i

��i

�vi= �H

��i,

�H�wi

= �H��i

��i

�wi= − �H

��i(34)

and therefore�H�vi

+ �H�wi

= 0 (35)



On using Equation (33) and the above relation, we obtain

viwi =mvi m

wi (36)

If vi = ( i +�i )/2 and wi = ( i −�i )/2, then i =√

�2i + 4mv

i mwi . Finally, the entropy expression

for the positive/negative distribution / is [74, 75]:

H(/,mv,mw) =n∑

i=1

[ i − mv

i − mwi − �i ln

( i + �i

2mvi

)](37)

Now, when no constraints are imposed, H is maximized when �i =mvi − mw

i . For the dataapproximation problem, we choose mv

i = 2mi and mwi =mi , where mi ≡mi (x) is a non-negative

weight function. The expression for the entropy becomes

H(/,m) =n∑

i=1

[ i − 3mi − �i ln

( i + �i

4mi

)](38)

where i =√

�2i + 8m2

i . On using the above form of H within the maximum-entropy variationalprinciple, signed basis functions with higher-order completeness are constructed.

The implementation of the signed maximum-entropy approximant has been carried in MatlabTM

[62]. In Figure 15, the MAXENT basis functions using a Gaussian prior weight function are shown.The domain is C= [0, 1], which is discretized by five equi-spaced nodes. In Figure 15(a)–(c),quadratically complete basis functions are depicted for varying values of �, whereas inFigure 15(d)–(f), basis functions with cubic complete basis functions are plotted for different �.The plots in Figure 15(g)–(i) are for basis functions that can reproduce {1, x, f (x)}, wheref (x)= exp(−(x − 0.5)2). In all cases, as � is increased, the basis functions are less negativeand are also closer to being an interpolant on the boundary.

6. CONCLUDING REMARKS

In this paper, we presented an overview and recent advances in the construction of meshfreeapproximation schemes. Meshfree basis functions such as MLS approximants, natural neighbour-based polygonal interpolants, and maximum-entropy (MAXENT) approximants were considered.The construction and applications of MAXENT approximants have recently come to the forefront[11–13], and hence greater emphasis was placed on the theoretical underpinnings of Bayesiantheory of probability, maximum-entropy principle [14, 15], and its numerical solution. We used theShannon–Jaynes entropy functional or relative entropy [17, 18] within the variational formulationto generalize the construction of MAXENT approximants. The merits of constructing basis functionsusing the maximum-entropy variational principle were examined, and the extension of Shannon–Jaynes entropy to physical distributions other than probabilities [73] was used to construct higher-order maximum-entropy basis functions. The use of maximum-entropy approximation schemes inhigher-dimensional parameter spaces is also promising [61]. A JAVA applet was developed,‡ and

‡Access to the applet will be made available through the first author’s web page.



0 0.25 0.5 0.75 1

0

0.5

1

1.5

x

φφ1 φ2 φ3 φ4 φ5 φ1 φ2 φ3 φ4 φ5

φ1 φ2 φ3 φ4 φ5

φ1 φ2 φ3 φ4 φ5

φ1 φ2 φ3 φ4 φ5

φ1 φ2 φ3 φ4 φ5

(a)0 0.25 0.5 0.75 1

0

0.5

1

1.5

x

φ

(b)

0 0.25 0.5 0.75 1

0

0.5

−0.5

−0.5 −0.5

−0.5

−0.5−0.5

1

1.5

x

φ

(c)0 0.25 0.5 0.75 1

0

0.5

1

x

φ

(d)

0 0.25 0.5 0.75 1

0

0.5

1

x

φ

(e)0 0.25 0.5 0.75 1

0

0.5

1

1.5

x

φ

(f)

Figure 15. Higher-order basis functions: (a)–(c) quadratic completeness using a Gaussian prior with� = 0, 2, 5; (d)–(f ) cubic completeness using a Gaussian prior with �= 0, 2, 5; and (g)–(i) reproducing

the functions {1, x, exp(−(x − 0.5)2)} using a Gaussian prior with � = 0, 2, 5.

basis function plots were presented to reveal the similarities and distinctions between differentmeshfree approximants. The maximum-entropy formulation with a non-uniform prior providesa simple and elegant means to directly impose linear essential boundary conditions in meshfreemethods. With the development of stable nodal integration schemes for meshfree Galerkin methods,



0 0.25 0.5 0.75 1

0

0.5

1

1.5

x

φφ1

φ2

φ3

φ4

φ5

(g)0 0.25 0.5 0.75 1

0

0.5

1

1.5

x

φ

φ1

φ2

φ3

φ4

φ5

(h)

0 0.25 0.5 0.75 1

0

0.5

−0.5

−0.5−0.5

1

1.5

x

φ

φ1

φ2

φ3

φ4

φ5

(i)

Figure 15. Continued.

background cells would no longer be needed for numerical integration. This advance would pave theway towards the conception of stable meshfree particle methods, which are particularly attractivefor the solution of problems that arise in nonlinear solid mechanics.

ACKNOWLEDGEMENTS

The authors are grateful for the research support of the National Science Foundation through contractCMS-0626481 to the University of California, Davis. The financial support to RWW through a NSFVIGRE graduate trainee award (NSF Grant DMS-0135345) is also acknowledged. NS thanks Roger Wetsand Michael Saunders for many helpful discussions.

REFERENCES

1. Meyer M, Lee H, Barr A, Desbrun M. Generalized barycentric coordinates on irregular polygons. Journal ofGraphics Tools 2002; 7(1):13–22.

2. Floater MS. Mean value coordinates. Computer Aided Geometric Design 2003; 20(1):19–27.3. Floater MS, Hormann K. Surface parameterization: a tutorial and survey. In Advances in Multiresolution for

Geometric Modelling, Dodgson NA, Floater MS, Sabin MA (eds), Mathematics and Visualization. Springer:Berlin, Heidelberg, 2005; 157–186.

4. Sukumar N, Tabarraei A. Conforming polygonal finite elements. International Journal for Numerical Methodsin Engineering 2004; 61(12):2045–2066.



5. Lancaster P, Salkauskas K. Surfaces generated by moving least squares methods. Mathematics of Computation1981; 37:141–158.

6. Sibson R. A vector identity for the Dirichlet tesselation. Mathematical Proceedings of the Cambridge PhilosophicalSociety 1980; 87:151–155.

7. Christ NH, Friedberg R, Lee TD. Weights of links and plaquettes in a random lattice. Nuclear Physics B 1982;210(3):337–346.

8. Hardy RL. Multiquadric equations of topography and other irregular surfaces. Journal of Geophysical Research1971; 76:1905–1915.

9. Buhmann MD. Radial Basis Functions: Theory and Implementations. Cambridge University Press: Cambridge,U.K., 2003.

10. Wendland H. Scattered Data Approximation. Cambridge University Press: Cambridge, U.K., 2005.11. Sukumar N. Construction of polygonal interpolants: a maximum entropy approach. International Journal for

Numerical Methods in Engineering 2004; 61(12):2159–2181.12. Arroyo M, Ortiz M. Local maximum-entropy approximation schemes: a seamless bridge between finite elements

and meshfree methods. International Journal for Numerical Methods in Engineering 2006; 65(13):2167–2202.13. Sukumar N. Maximum entropy approximation. AIP Conference Proceedings 2005; 803(1):337–344.14. Shannon CE. A mathematical theory of communication. The Bell Systems Technical Journal 1948; 27:379–423.15. Jaynes ET. Information theory and statistical mechanics. Physical Review 1957; 106(4):620–630.16. Jaynes ET. Information theory and statistical mechanics. In Statistical Physics: The 1962 Brandeis Lectures,

Ford K (ed.). W. A. Benjamin: New York, 1963; 181–218.17. Kullback S. Information Theory and Statistics. Wiley: New York, NY, 1959.18. Shore JE, Johnson RW. Axiomatic derivation of the principle of maximum entropy and the principle of minimum

cross-entropy. IEEE Transactions on Information Theory 1980; 26(1):26–36.19. Rockafellar RT. Convex Analysis. Princeton University Press: Princeton, NJ, 1970.20. Strang G, Fix G. An Analysis of the Finite Element Method. Prentice-Hall: Englewood Cliffs, NJ, 1973.21. Epperson JF. On the Runge example. The American Mathematical Monthly 1987; 94(4):329–341.22. Farouki RT, Goodman TNT. On the optimal stability of the Bernstein basis. Mathematics of Computation 1996;

65(216):1553–1566.23. Farouki RT. Private communication, 2006.24. Pena JM. B-splines and optimal stability. Mathematics of Computation 1997; 66(220):1555–1560.25. Hughes TJR, Cottrell JA, Bazilevs Y. Isogeometric analysis: CAD, finite elements, NURBS, exact geometry and

mesh refinement. Computer Methods in Applied Mechanics and Engineering 2005; 193(39–41):4135–4195.26. Belytschko T, Lu YY, Gu L. Element-free Galerkin methods. International Journal for Numerical Methods in

Engineering 1994; 37:229–256.27. Liu WK, Jun S, Zhang YF. Reproducing kernel particle methods. International Journal for Numerical Methods

in Engineering 1995; 20:1081–1106.28. Belytschko T, Krongauz Y, Organ D, Fleming M, Krysl P. Meshless methods: an overview and recent developments.

Computer Methods in Applied Mechanics and Engineering 1996; 139:3–47.29. Li S, Liu WK. Meshfree and particle methods and their applications. Applied Mechanics Review 2002; 55(1):1–34.30. Huerta A, Belytschko T, Fernandez-Mendez S, Rabczuk T. Meshfree methods. In Encyclopedia of Computational

Mechanics, Stein E, de Borst R, Hughes TJR (eds), vol. 1. Wiley: Chichester, 2004; 279–309 (Chapter 10).31. Fries TP, Matthies HG. Classification and overview of meshfree methods. Technical Report Informatikbericht-Nr.

2003-03, Institute of Scientific Computing, Technical University Braunschweig, Braunschweig, Germany, 2004.32. Atluri SN, Shen S. The Meshless Local Petrov–Galerkin (MLPG) Method. Tech Science Press: Encino, CA,

2002.33. Liu GR. Mesh Free Methods: Moving Beyond the Finite Element Method. CRC Press: Boca Raton, FL, 2003.34. Li S, Liu WK. Meshfree Particle Methods. Springer: New York, NY, 2004.35. Dolbow J, Belytschko T. An introduction to programming the meshless Element Free Galerkin method. Archives

of Computational Methods in Engineering 1998; 5(3):207–241.36. Du Q, Gunzburger M, Ju L. Meshfree, probabilistic determination of point sets and support regions for meshless

computing. Computer Methods in Applied Mechanics and Engineering 2002; 191:1349–1366.37. Kansa EJ. Multiquadrics—a scattered data approximation scheme for applications to computational fluid-dynamics.

1. Surface approximations and partial derivative estimates. Computers and Mathematics with Applications 1990;19(8/9):127–145.



38. Kansa EJ. Multiquadrics—a scattered data approximation scheme for applications to computational fluid-dynamics.2. Solutions to parabolic, hyperbolic and elliptic partial-differential equations. Computers and Mathematics withApplications 1990; 19(8/9):147–161.

39. Wachspress EL. A Rational Finite Element Basis. Academic Press: New York, NY, 1975.40. Sukumar N, Malsch EA. Recent advances in the construction of polygonal finite element interpolants. Archives

of Computational Methods in Engineering 2006; 13(1):129–163.41. Hormann K. Barycentric coordinates for arbitrary polygons in the plane. Technical Report IfI-05-05, Department

of Informatics, Clausthal University of Technology, February 2005.42. Cueto E, Sukumar N, Calvo B, Martınez MA, Cegonino J, Doblare M. Overview and recent advances in natural

neighbour Galerkin methods. Archives of Computational Methods in Engineering 2003; 10(4):307–384.43. Rashid MM, Selimotic M. A three-dimensional finite element method with arbitrary polyhedral elements.

International Journal for Numerical Methods in Engineering 2006; 67(2):226–252.44. Sivia DS. Data Analysis: A Bayesian Tutorial. Oxford University Press: Oxford, 1996.45. Kapur JN. Maximum-Entropy Models in Science and Engineering. (1st rev. edn). Wiley: New Delhi, India, 1993.46. Jeffreys H. Theory of Probability. Clarendon Press: Oxford, 1939.47. Cox RT. Probability, frequency and reasonable expectation. American Journal of Physics 1946; 14:1–13.48. Cover TM, Thomas JA. Elements of Information Theory. Wiley: New York, NY, 1991.49. Jaynes ET. Information theory and statistical mechanics. II. Physical Review 1957; 108(2):171–190.50. Jaynes ET. Probability Theory: The Logic of Science. Cambridge University Press: Cambridge, U.K., 2003.51. Jaynes ET. The relation of Bayesian and maximum entropy methods. In Maximum-Entropy and Bayesian Methods

in Science and Engineering, Erickson GJ, Smith CR (eds), Foundations, vol. 1. Dordrecht, The Netherlands,1988; 25–29.

52. Gull SF. Bayesian inductive inference and maximum entropy. In Maximum-Entropy and Bayesian Methods inScience and Engineering, Erickson GJ, Smith CR (eds), Foundations, vol. 1. Dordrecht, The Netherlands, 1988;53–74.

53. Khinchin A. Mathematical Foundations of Information Theory. Dover: New York, NY, 1957.54. Rosenkrantz RD (ed.). E. T. Jaynes: Paper on Probability, Statistics and Statistical Physics. Kluwer Academic

Publishers: Dordrecht, The Netherlands, 1989.55. Mead LR, Papanicolaou N. Maximum-entropy in the problem of moments. Journal of Mathematical Physics

1984; 25(8):2404–2417.56. Kullback S, Leibler RA. On information and sufficiency. Annals of Mathematical Statistics 1951; 22(1):79–86.57. Barndorff-Nielsen O. Information and Exponential Families in Statistical Theory. Wiley: New York, NY, 1978.58. Amari S, Nagaoka H. Methods of Information Geometry. Oxford University Press: New York, NY, 2000.59. Agmon N, Alhassid Y, Levine RD. An algorithm for finding the distribution of maximal entropy. Journal of

Computational Physics 1979; 30:250–258.60. Boyd S, Vandenberghe L. Convex Optimization. Cambridge University Press: Cambridge, U.K., 2004.61. Gupta MR. An information theory approach to supervised learning. Ph.D. Thesis, Department of Electrical

Engineering, Stanford University, Palo Alto, CA, U.S.A., March 2003.62. Saunders MA. PDCO: primal-dual method for convex objectives. Available at http://www.stanford.edu/

group/SOL/software/pdco.html, Department of Management Science and Engineering, Stanford University,Stanford, CA, 2002.

63. Burden RL, Faires JD. Numerical Analysis (10th edn). Thomson/Brooks/Cole: Belmont, CA, 2005.64. Rajan VT. Optimality by the Delaunay triangulation in Rd . Discrete and Computational Geometry 1994;

12(2):189–202.65. Krongauz Y, Belytschko T. Consistent pseudo-derivatives in meshless methods. Computer Methods in Applied

Mechanics and Engineering 1997; 146(3–4):371–396.66. Babuska I, Melenk JM. The partition of unity method. International Journal for Numerical Methods in Engineering

1997; 40:727–758.67. Sukumar N, Wets RJ-B. Deriving the continuity of maximum-entropy basis functions via variational analysis.

2006, submitted.68. Rockafellar RT, Wets RJ-B. Variational Analysis (2nd edn). Springer: Berlin, 2004.69. Penrose R. A generalized inverse for matrices. Proceedings of the Cambridge Philosophical Society 1955;

51:406–413.70. Shepard D. A two-dimensional interpolation function for irregularly spaced points. ACM National Conference,

1968; 517–524.



71. Shapiro V. Theory of R-functions and applications: a primer. Technical Report CPA88-3, Cornell ProgrammableAutomation, Sibley School of Mechanical Engineering, Ithaca, NY 14853, 1991.

72. Rvachev VL, Sheiko TI, Shapiro V, Tsukanov I. On completeness of RFM solution structures. ComputationalMechanics 2000; 25(2–3):305–316.

73. Skilling J. The axioms of maximum entropy. In Maximum-Entropy and Bayesian Methods in Science andEngineering, Erickson GJ, Smith CR (eds), Foundations, vol. 1. Dordrecht, The Netherlands, 1988; 173–187.

74. Gull SF, Skilling J. The MEMSYS5 users’ manual. Technical Report, Maximum Entropy Data Consultants Ltd.,Suffolk, U.K., 1990.

75. Hobson MP, Lasenby AN. The entropic prior for distributions with positive and negative values. Monthly Noticesof the Royal Astronomical Society 1998; 298(3):905–908.


Documents

Overviewandconstructionofmeshfreebasisfunctions:From ...dilbert.engr.ucdavis.edu/~suku/maxent/papers/maxent...and ﬁnite element and meshfree methods to name a few. In this paper,