Int. J. Human-Computer Studies 64 (2006) 670–682

www.elsevier.com/locate/ijhcs

doi:10.1016/j.ijhcs.2006.01.002

Computational modeling and experimental investigation of effects of compositional elements on interface and design aesthetics

Michael Bauerly*, Yili Liu

Department of Industrial and Operations Engineering, The University of Michigan, 1205 Beal Avenue, Ann Arbor, MI 48109-2117, USA

* Corresponding author.

Received 17 May 2005; received in revised form 15 December 2005; accepted 24 January 2006

Available online 7 March 2006

Communicated by J. Scholtz

Abstract

This article describes computational modeling and two corresponding experimental investigations of the effects of symmetry, balance and quantity of construction elements on interface aesthetic judgments. In the first experiment, 30 black and white geometric images were developed by systematically varying these three attributes in order to validate computational aesthetic quantification algorithms with subject ratings. The second experiment employed the same image layout as Experiment 1 but with realistic-looking web pages as stimuli. The images were rated by 16 subjects in each experiment using the ratio-scale magnitude estimation method against a benchmark image with average balance and symmetry values and a standard number of elements. Subjects also established an ordered list of the images according to their aesthetic appeal using the Balanced-Incomplete-Block (BIB) ranking method.

Results from both experiments show that subjects are adept at judging symmetry and balance in both the horizontal and vertical directions and thus the quantification of those attributes is justified. The first experiment establishes a relationship between a higher symmetry value and aesthetic appeal for the basic imagery showing that subjects preferred symmetric over non-symmetric images. The second experiment illustrates that increasing the number of groups in a web page causes a decrease in the aesthetic appeal rating.

© 2006 Elsevier Ltd. All rights reserved.

Keywords: Aesthetics; Engineering aesthetics; Balance; Symmetry; Display evaluation

1. Introduction

Research on visual displays has traditionally defined display effectiveness with criteria such as legibility or difficulty of target search and information access. Most of this research focuses on qualitative description and summarization and does not utilize a quantitative mathematical modeling approach. Because the majority of human factors design guidelines are qualitative, the effectiveness of many design techniques is left to debate: there are no methods to provide numerical analysis or direct comparison between different design proposals.

Tullis (1988b), for example, gives many well-defined but qualitative guidelines for text-based screen design. Most of these principles can be transferred to many types of visual bitmapped displays, where they prove to be extremely helpful in many design situations. One major drawback, however, is that they do not provide a way for the designer to assign quantified values to the specific components of screen design.

There are some past attempts to adopt a quantitative and computational approach to interface design. For example, Tullis (1983) developed metrics for quantifying the effects of item grouping, density and complexity on the usability of text-based displays and later tested those attributes against measures of search time and preference (Tullis, 1984, 1988a). Streveler and Wasserman (1984) propose creating several classes of screen measures for alphanumeric displays including the aesthetic measures of balance and symmetry. One of the overlapping attributes in these two lines of research is the use of characteristics such as size or number of groups as a descriptive metric. Liu and Wickens (1992) use cluster analysis to achieve visual grouping and encoding of data in a two-dimensional (2-D) grid according to quantitative similarity values. Subjects were then asked to complete various judgment tasks with the display grids and their judgment performance showed significant improvement with the use of the display grids. Sears (1993) developed a metric for developing and comparing user interface widget layout based on a simple description of the stages required to complete a task within the interface.

In addition to the research challenge of developing quantitative metrics for interface evaluation and analysis, there is an issue of integrating aesthetic factors in interface evaluation. Consideration of aesthetics has largely been ignored in human factors analysis of displays until some recent work that has emerged on the increasing importance of aesthetics in various domains of contemporary society. Liu (2003a, b) provided a comprehensive review of the major schools of aesthetic theory and their relationship to aesthetic design and human factors engineering. Jordan (1997) calls for the design of products that provide particular aesthetic pleasure beyond simple usability and the associated positive feelings of security, confidence, pride and satisfaction. Hallnas and Redstrom (2002) declare that increasing the aesthetics of the computational interface will only aid in the widespread acceptance and ‘presence’ of ubiquitous computing devices. A recent survey by Kim et al. (2003) analyzes several web pages to determine the common aesthetic design factors and the corresponding emotional responses of users. Another study by Healey and Enns (2002) manipulates visualization techniques in order to provide guidelines for designing effective visual displays by encoding weather data in a map of the US. This process optimized aesthetic appeal and information retrieval through the creation of different presentation techniques to find an optimal strategy. The relation of aesthetics to system effectiveness cannot be ignored. Tractinsky et al. (2000) concluded that users of an automated teller machine (ATM) found the system to be more usable based solely on aesthetic alterations to the interface without any changes in functionality.

In addition to this recent interest from many areas, artists and designers have long treated aesthetics as a primary aspect of their work; however, they mainly describe aesthetic terms in qualitative or subjective languages that do not easily allow for engineering implementation. Recently, several artists have started to include themes of a computationally generated superior aesthetic. One of the earliest groups to utilize computation was Groupe de Recherche d’Art Visuel (GRAV) of the 1960s. Prince (2000) notes that this group was “dedicated to understanding mathematical simulation and aesthetics” and served as pioneers in this artistic field. The use of an exploratory approach that utilized the most appropriate solution in future iterations is one that would appear here for the first time computationally. Another contemporary artist making similar efforts is Steven Rooke. His art utilizes a genetic process of aesthetic selection such that aesthetically interesting features emerge in subsequent iterations of the design process as recorded by World (1996). At each iteration, certain generations are given higher aesthetic fitness scores which are passed on to future generations, creating an aesthetically superior set of offspring. The aesthetic fitness, however, relies on the judgment of the artist to assign a score and thus the process remains highly subjective.

Reiser and Reiser (1995) create a list of aesthetic considerations specific to multimedia; but the end result is one that only extends the qualitative human factors checklist to include more items to consider without knowing quantitatively where an optimal design space exists.

This paper describes our computational modeling and experimental research work that attempts to bridge the scientific methods of human performance and display analysis with aesthetic design principles. This is executed in a quantitative manner through the development and validation of numerical quantifications of the effects of three compositional elements on aesthetic judgments. The three elements (symmetry, balance and compositional blocking) are present in 2-D media.

Inspiration is taken from the development of methods for quantifying the grouping, density and complexity of text-based displays by Tullis (1983). Relatively similar attempts have been made using many more attributes than what are presented here (Ngo, 2001; Ngo et al., 2003; Lavie and Tractinsky, 2004). These earlier studies, however, have not been validated with any experimental investigation of user judgment of the proposed quantification methods. It is important that we develop quantitative methods that match the perceptual and mental processes of system users, which creates the necessity of human experimental verification.

While the objective of the present study is to develop quantitative indexes for interface aesthetic evaluation, not to construct or assess a theoretical model, the present study is inspired by the psychophysical school of aesthetic theories. As discussed in detail in Liu (2003a), several major schools of aesthetic theories exist, including philosophical theories, cognitive and social theories, natural and sexual selection theories and psychophysical theories. Among these schools of theories, psychophysical theories emphasize investigations of quantitative relationships between aesthetic response and basic pictorial elements, which is an approach adopted by the present study. The results of the present study, in return, provide an evaluation of the role of this school of theories in aesthetic interface design and evaluation.

The compositional elements chosen in this study represent three basic pictorial elements and design concepts that follow from previous research. The element of balance is well understood by experts and non-experts alike and the preference for balance in visual displays is well documented (Wilson and Chatterjee, 2005). Existing theories suggest that visual balance is necessary because it unifies the elements of a display into a cohesive whole thus creating integrity and meaning (Locher et al., 1998). Pre-attentive visual processing of balance can be accomplished within 100 ms (Ognjenovic, 1991; Locher and Nagy, 1996) and thus can quickly help to structure and guide the viewer’s gaze through an image (Locher, 1996).

Subject preferences for symmetry also exist for the same reason as those for balance. Any composition with perfect symmetry is by definition perfectly balanced and thus it serves as an element that can be pre-attentively processed and serves as a guide for the viewer. The preference for symmetry is also grounded in sexual selection theories. For example, cross-cultural preferences for symmetrical faces may be explained by evolutionary processes that favor symmetrical facial features, among other things (Langlois and Roggman, 1990). Additionally, symmetric faces may reveal a higher level of ability to resist parasites and may have indicated a stronger hunter in males. In females, many symmetrical body and facial features that are considered attractive may be indicators of higher fertility levels (Buss and Barnes, 1986).

The number of design elements or visual groups is also related to aesthetic appraisal. Depending on the number and size of the ‘compositional building blocks,’ a display can appear empty and sparse or dense and overcrowded. A high information density may cause perceptual channels to become overloaded (Cropper and Evans, 1968). Tullis (1983) summarizes the recommendations for an appropriate level of density for alphanumeric displays but he points out that it is not always possible to convert from one density measure to another. This is particularly true when using density as a measurement in modern bitmapped displays.

The role of symmetry, grouping and balance in visual perception has long been recognized by the Gestalt theory of perceptual organization. “Gestalt” means form or shape and, particularly, “good form” or “good shape” that emerges when the parts of a perceived object are grouped to form the perceptual whole (Boring, 1950). Symmetry, grouping and balance are among the numerous Gestalt principles that have been proposed to describe how the perceptual elements are grouped into recognizable whole objects or “good forms.” According to the Gestalt principles, the more symmetrical a region’s shape, the more likely it is seen as a figure in contrast to its background. Similarly, visual patterns grouped together by similarity or proximity tend to be seen as a whole figure and visual arrangements that are more uniform and homogeneous tend to be perceived as “good forms.”

These existing theories and research illustrate the importance of balance, symmetry and grouping for perception and preference. However, none of these previous studies has adopted a mathematical approach to quantify the joint effects of these variables on users’ aesthetic judgments, which is the focus of the present study.

1.1. Introduction to the experimental procedure

Two experiments are reported in this paper whose purpose is twofold: to determine whether the metrics of three interface compositional elements (symmetry, balance and the number of compositional groups) reflect subject ratings of those attributes and to determine whether there is a relationship between the compositional elements and subject ratings of aesthetic appeal. Numerical values representing the three compositional elements are calculated for two different types of stimuli in the two experiments. In Experiment 1, basic black and white geometric images were used and in Experiment 2, web pages following the exact same compositions as the images from Experiment 1 were used to represent a real-world interface situation.

The three visual elements of balance, symmetry and the number of groups were chosen because they represent simple, intuitive concepts in design and they are related to the measures described or utilized earlier in the analysis of alphanumeric displays. For example, Streveler and Wasserman (1984) proposed several types of screen measures including the aesthetic measures of balance and symmetry. The work by Tullis (1984) on interface usability metrics included the number of the groups. In the current study, balance is derived from summing the visual moments in the horizontal and vertical directions to zero about the balance point. Symmetry is analyzed with a pixel-by-pixel comparison about a central axis of reflection giving more influence to comparisons closer to the axis because of the increased visual influence in that area. The number of groups is a simple count of the distinct visual groups that exist in an image. Because rectangles or pictures are used as compositional elements the number of groups is trivial to compute in both experiments.

The validity of the quantitative analysis for symmetry and balance was tested through subject ratings of various images with the ratio-scale magnitude estimation method. The experimental stimuli were created according to target parameters for symmetry, balance and composition elements and all were compared to a benchmark image. Three experimental sessions were completed where: (1) the subjects rated the aesthetic appeal of the images compared to the benchmark; (2) subjects ordered images from least appealing to most appealing in a Balanced-Incomplete-Block (BIB) methodology to establish an overall preference of the images; and (3) subjects rated their impression of the image balance and symmetry in both the horizontal and vertical directions as well as rated the aesthetic appeal for a second time.

The results show that subject responses validated the methods for determining symmetry and balance. Additionally, aesthetic appeal was highly dependent on symmetry for the basic images in Experiment 1 and on the number of groups for the web pages in Experiment 2.

Fig. 2. 5 × 6 pixel bitmap (columns A–F, rows 1–5) with asymmetric pixel pairs at coordinates A3 and F3 with a vertical axis of reflection.

2. Image quantification methods

Three basic compositional attributes were each quantified on individual scales. Simple algorithms were created in an attempt to mimic the human cognitive representation of the attributes of symmetry and balance, with the third attribute being the number of compositional building blocks used in the image.

2.1. Symmetry

Symmetry, s, is an analysis of the similarity of pixels on opposite sides of an axis of reflection. This particular type of symmetry is referred to as ‘bilateral symmetry’. This algorithm takes a microscopic approach and compares each half of an image pixel by pixel, as opposed to a macroscopic view that might compare higher level elements such as specific shapes or lines. It should also be noted that this algorithm only considers symmetry about a vertical or horizontal axis, but the general strategy can be utilized about any axis of reflection.

It was hypothesized that those pixel comparisons that are close to the axis of reflection have a higher influence on the overall impression of symmetry than do comparisons which are further away from the reflection axis. Take Figs. 1 and 2 as illustrations of this assumption. Each image is 5 × 6 pixels and has only one non-matching pixel pair. Fig. 1 has a non-matching pixel pair closer to the vertical axis of symmetry (at coordinates C3 and D3) than does Fig. 2 (at coordinates A3 and F3). According to the algorithm discussed below, Fig. 1 is less symmetric than Fig. 2 because as the pixel comparisons move farther away from the axis of reflection, their influence on the overall symmetry value decreases.

Fig. 1. 5 × 6 pixel bitmap (columns A–F, rows 1–5) with asymmetric pixel pairs at coordinates C3 and D3 with a vertical axis of reflection.

Eq. (1) below gives the equation for symmetry. The variable m is the pixel length of the image dimension that is parallel to the axis of reflection. For example, if the axis of reflection is vertical, m is the height of the image in pixels. The variable n is the number of comparisons required in each row or column of pixels. Taking the case where the axis of reflection is again vertical, the number of comparisons is the image width in pixels divided by 2 when the width is an even number of pixels, or it is the image width less 1 divided by 2 when the width is odd. The symmetry factor $X_{ij}$ becomes a binary variable which is equal to one when the pixel pairs are the same and zero when they are opposite. To extend this algorithm to a multi-color image, $X_{ij}$ could be defined for each combination of colors such that black might have a higher symmetry factor with dark gray than with white.

As discussed illustratively above, Eq. (1) gives a positive comparison ($X_{ij} = 1$) at the edge of the image less influence on the overall symmetry value than that of a positive comparison occurring at the axis of reflection. Specifically, a positive comparison at the axis of reflection is deemed to be twice as influential as one at the farthest distance away

$$ s = \frac{2}{3mn} \sum_{i=1}^{m} \sum_{j=1}^{n} X_{ij} \left( 1 + \frac{j-1}{n-1} \right). \tag{1} $$
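As an illustration, Eq. (1) can be sketched in a few lines of Python for a binary image and a vertical axis of reflection. This is a minimal sketch rather than the implementation used in the study: the function names, the list-of-rows image representation (1 for black, 0 for white) and the two test bitmaps are assumptions made here. The index j is taken to run from the image edge (j = 1) to the axis (j = n), which is the reading under which a match at the axis carries twice the weight of a match at the edge.

```python
from typing import List, Sequence


def symmetry_vertical_axis(image: Sequence[Sequence[int]]) -> float:
    """Symmetry s of Eq. (1) about a vertical axis (hypothetical helper).

    `image` is a sequence of rows; each pixel is 1 (black) or 0 (white).
    """
    m = len(image)          # image height: the dimension parallel to the axis
    width = len(image[0])
    n = width // 2          # comparisons per row; a middle column is skipped
    if n < 2:
        raise ValueError("Eq. (1) needs at least two comparisons per row")
    total = 0.0
    for row in image:
        for j in range(1, n + 1):        # j = 1 at the edge, j = n at the axis
            x_ij = 1 if row[j - 1] == row[width - j] else 0
            total += x_ij * (1 + (j - 1) / (n - 1))
    return 2.0 * total / (3 * m * n)


def symmetry_horizontal_axis(image: Sequence[Sequence[int]]) -> float:
    """Same measure about a horizontal axis, obtained by transposing."""
    return symmetry_vertical_axis([list(col) for col in zip(*image)])


def blank(rows: int, cols: int) -> List[List[int]]:
    return [[0] * cols for _ in range(rows)]


# Assumed stand-ins for Figs. 1 and 2: one black pixel on an empty 5 x 6 grid.
near_axis = blank(5, 6)
near_axis[2][2] = 1         # C3, so the pair C3/D3 does not match
near_edge = blank(5, 6)
near_edge[2][0] = 1         # A3, so the pair A3/F3 does not match

print(symmetry_vertical_axis(near_axis), symmetry_vertical_axis(near_edge))
```

Under these assumptions the mismatch next to the axis lowers s more than the mismatch at the edge, which matches the ordering of Figs. 1 and 2 described in the text.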

2.2. Balance

The balance point, b, is the Cartesian coordinate at the center of the visual mass of the image. This center can easily be found once individual masses are assigned to each pixel. In this experiment, black pixels are given a mass of one and white pixels given a mass of zero. Eqs. (2) and (3) give the center of balance as ($x_b$, $y_b$) where w is the image width in pixels, h is the image height in pixels and W, the visual weight, is the summation of black pixels in each pixel column (Eq. (2)) or row (Eq. (3)). For the experimental procedure and analysis, b is given as a set of normalized coordinates between zero and one as in Eq. (4)

$$ \sum_{x=1}^{w} W_x (x - x_b) = 0, \tag{2} $$

$$ \sum_{y=1}^{h} W_y (y - y_b) = 0, \tag{3} $$

$$ b = \left( \frac{x_b}{w}, \frac{y_b}{h} \right). \tag{4} $$


Although it is somewhat related to s, the measure of b is distinct. The strongest relationship between the two measures occurs when an image is perfectly symmetric. For example, an image that is perfectly symmetric (s = 1) about a vertical line will have a balance point that lies somewhere along that line of reflection and thus the x-coordinate of b will be 0.5. As another example, Fig. 3 gives an image with perfectly centered balance (0.5, 0.5) but less than perfect symmetry.
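The balance point of Eqs. (2)–(4) can be sketched as a weighted centroid. This is a minimal sketch under stated assumptions: the function name and image representation are hypothetical, and pixel positions are measured at pixel centers (x − 0.5 for x = 1, …, w) rather than at the integer index used in Eq. (2). That choice is an assumption made here so that a mirror-symmetric image yields exactly 0.5, as the text states.

```python
from typing import Sequence, Tuple


def balance_point(image: Sequence[Sequence[int]]) -> Tuple[float, float]:
    """Normalized balance point b of Eq. (4) (hypothetical helper).

    `image` is a sequence of rows; black pixels are 1 (mass one) and
    white pixels are 0 (mass zero). y is measured from the top row.
    """
    h = len(image)
    w = len(image[0])
    col_weight = [sum(row[x] for row in image) for x in range(w)]  # W_x
    row_weight = [sum(row) for row in image]                       # W_y
    mass = sum(row_weight)
    if mass == 0:
        raise ValueError("an all-white image has no balance point")
    # Solving Eq. (2) for x_b gives the mass-weighted mean position;
    # positions are taken at pixel centers (assumption, see lead-in).
    x_b = sum(W * (x + 0.5) for x, W in enumerate(col_weight)) / mass
    y_b = sum(W * (y + 0.5) for y, W in enumerate(row_weight)) / mass
    return (x_b / w, y_b / h)


# Assumed example in the spirit of Fig. 3: two black pixels in opposite
# corners of a 5 x 6 grid are centered in balance but not mirror-symmetric.
corners = [[0] * 6 for _ in range(5)]
corners[0][0] = 1
corners[4][5] = 1

print(balance_point(corners))
```

The two-corner image has its balance point at the center of the grid even though no left/right or top/bottom pixel pair containing a black pixel matches, which illustrates why b and s are treated as distinct measures.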

2.3. Number of groups

The number of groups, n, is a simple count of the distinct visual groups that exist in an image. These experiments use rectangles or pictures as compositional elements and thus the number of individual blocks in each image is trivial to calculate. Inclusion of this attribute was driven by the hypothesis that images with a higher n would potentially be less focused, which could have a significant effect on aesthetic appeal.
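For stimuli built from separated black rectangles, one way to automate the count is to label connected regions of black pixels. The sketch below is an assumption made for illustration, not a procedure described in the study: it treats a visual group as a 4-connected region of black pixels, which holds only when the blocks do not touch, and the function name is hypothetical.

```python
from collections import deque
from typing import Sequence


def count_groups(image: Sequence[Sequence[int]]) -> int:
    """Count 4-connected regions of black (1) pixels (hypothetical helper)."""
    h, w = len(image), len(image[0])
    seen = [[False] * w for _ in range(h)]
    groups = 0
    for r in range(h):
        for c in range(w):
            if image[r][c] != 1 or seen[r][c]:
                continue
            groups += 1                      # found an unvisited block
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:                     # breadth-first flood fill
                cr, cc = queue.popleft()
                for nr, nc in ((cr - 1, cc), (cr + 1, cc),
                               (cr, cc - 1), (cr, cc + 1)):
                    if (0 <= nr < h and 0 <= nc < w
                            and image[nr][nc] == 1 and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
    return groups


# Assumed toy layout with three separated rectangles.
layout = [
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
]
print(count_groups(layout))
```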

3. Experiment 1

The image quantification metrics described above are evaluated in Experiment 1 with abstract black-and-white images.

Fig. 3. 5 × 6 pixel bitmap (columns A–F, rows 1–5) with centered balance (0.5, 0.5) but imperfect symmetry.

3.1. Methods

3.1.1. Participants

Sixteen subjects aged 21–29 participated in each of three experimental sessions. All subjects had normal (20/20) or corrected-to-normal vision and normal color vision. Art and architecture students were not allowed to participate in the experiment in order to avoid introduction of any potential influence of specialized aesthetic training or background. The entire experimental procedure took approximately 1 h and subjects were compensated $10.00 for their participation.

3.1.2. Stimuli

Thirty images were developed to target values of s, b and n, including one image as a benchmark and one as a preview with which the subjects rehearsed the questions for each of three sessions. While the images were designed specifically for combinations of the three composition attributes, there exist an infinite number of possible images for each set. Care was taken to make their design as homogeneous as possible. This was done by maintaining similar design elements such as overall visual mass, white space around each rectangle and similar spacing strategies between rectangles. Fig. 4 gives an example of the imagery, showing the benchmark image used in comparison to all images alongside the image used for rehearsing the experimental procedure.

Fig. 4. Example of experimental stimuli from Experiment 1. Shown here is the benchmark image (left) and the example image used for question rehearsal (right).

3.1.3. Procedure

All data were collected by recording verbal responses to a standard set of questions about each image or group of images. The experiment was conducted in a sound-insulated, well-lit experimental lab. Participants sat at a desk opposite the experimenter and viewed all images on a 17-inch CRT monitor at 1024 × 768 pixel resolution, with all images measuring 400 pixels square.

In the first experimental session, subjects were asked to use the magnitude estimation method to rate the 28 test images. They were instructed to rate the overall aesthetic appeal of each test image provided that the benchmark image was rated as a 10. For example, if the test image was twice as appealing as the benchmark then it was rated as a 20 and if it was half as appealing it was rated as a 5. This rating method allows the subjects to use any positive number they see fit and their ratings are not restricted by fixed scales. Each image was displayed on a screen next to the benchmark image and the presentation sequence was randomly ordered for each subject and each trial. Subjects were encouraged to give a rating quickly and not to think about the images for too long. As soon as a rating was given verbally, the next test image in the sequence was displayed.

Fig. 5. Log of mean subject ratings for symmetry about a horizontal axis are plotted against the s values. (Axes: Horizontal Symmetry Algorithm; Exp 1: Mean Subject Rating.)

Fig. 6. Log of mean subject ratings for symmetry about a vertical axis are plotted against the s values. (Axes: Vertical Symmetry Algorithm; Exp 1: Mean Subject Rating.)

The second session required subjects to complete a BIB ranking of 24 of the 28 test images along with the benchmark image, making 25 images total. With the BIB ranking procedure, the images were presented in groups of 4 rather than all 25 images at once. This procedure considers human perceptual, memory and judgment capacity and thus it helps obtain more reliable results than asking subjects to judge a large number of images at once. For each group of 4 images, the images were ranked from the least aesthetically appealing (“the ugliest”) to the most aesthetically appealing (“the prettiest”). The BIB was a complete design such that each image was seen eight times and was compared to all other images in the comparison set of 25. As in the previous session, as soon as the rank order of a group was given, the next set of 4 images was displayed.
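The stated design parameters can be checked against the standard counting identities for a balanced incomplete block design. The symbols below are introduced here only for this check and do not appear in the original text: v is the number of images, k the group size, r the number of times each image is seen, B the number of groups (written B to avoid confusion with the balance point b) and λ the number of groups in which any given pair of images appears together.

$$ B = \frac{v r}{k} = \frac{25 \times 8}{4} = 50, \qquad \lambda = \frac{r (k - 1)}{v - 1} = \frac{8 \times 3}{24} = 1. $$

Under these identities the session would comprise 50 groups of 4 images, with every pair of images appearing together in exactly one group, which is consistent with the statement that each image was compared to all other images in the set.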

In the last session, the magnitude estimation method was again used to rate the images compared to the benchmark on multiple scales. Subjects gave two separate ratings for symmetry about the horizontal and vertical axes with a higher rating corresponding to a higher degree of symmetry. Two ratings were given for horizontal and vertical balance, respectively, with higher ratings given to images that were well-balanced and centered and lower ratings given to images that were heavily skewed in one direction. Ideally, subjects would give the same rating to an image that had b = (0.3, 0.3) as one that had b = (0.7, 0.7) because both would have horizontal and vertical balance points the same distance from the center point of the image. Subjects were then asked again to rate the overall aesthetic appeal of the images compared to the benchmark.

3.2. Results and discussion

The data collected using the magnitude estimation method are log-normally distributed and thus the data analysis uses the log of the geometric mean subject ratings where appropriate. This allows for linear relationships to be established between subjective ratings of symmetry and balance and the corresponding s and b values, as well as for a regression model of aesthetic appeal ratings based on s, b and n.
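The log of the geometric mean of a set of ratings equals the arithmetic mean of the logs of the individual ratings, which is why it suits log-normally distributed magnitude estimates. The sketch below illustrates that computation; the function name is hypothetical, the base-10 logarithm is an assumption (the text does not state the base), and the three ratings are made-up values, not data from the experiment.

```python
import math
from typing import Sequence


def log_geometric_mean(ratings: Sequence[float]) -> float:
    """Base-10 log of the geometric mean of positive ratings.

    Computed as the mean of the individual logs, which is algebraically
    the same quantity and avoids forming a large product.
    """
    if not ratings or any(r <= 0 for r in ratings):
        raise ValueError("magnitude estimates must be positive numbers")
    return sum(math.log10(r) for r in ratings) / len(ratings)


# Made-up ratings against a benchmark of 10: half, equal and twice as appealing.
example_ratings = [5.0, 10.0, 20.0]
print(log_geometric_mean(example_ratings))
```

Ratings of 5 and 20 are symmetric about the benchmark on a ratio scale, so the geometric mean of this example sits at the benchmark value of 10, whereas the arithmetic mean would be pulled upward.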

3.2.1. Symmetry ratings

The log of the mean subject symmetry ratings for this experiment is plotted against the s values for symmetry about a horizontal and vertical axis in Figs. 5 and 6.

The linearity of the data plots allows for the subject ratings of symmetry to be reflected as a linear regression function of the s value of the image. Eqs. (5) and (6) give the log of the subject rating, r, as a function of s for symmetry about horizontal and vertical axes, respectively

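$$ \mathrm{LOG}(r_{\mathrm{HORSYM1}}) = 1.74\, s_{\mathrm{HOR}} - 0.37, \quad R^2 = 0.78, \tag{5} $$

$$ \mathrm{LOG}(r_{\mathrm{VERTSYM1}}) = 1.83\, s_{\mathrm{VERT}} - 0.48, \quad R^2 = 0.70. \tag{6} $$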

These equations and the high $R^2$ values indicate that subjects were quite adept at rating the symmetry of the images and that their ratings corresponded with the s values of the images.
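To make the fitted relationships concrete, Eqs. (5) and (6) can be inverted to give a predicted rating on the original magnitude-estimation scale. This is a minimal sketch: the function name and the dictionary of coefficients are hypothetical, and LOG is assumed to be the base-10 logarithm, which the text does not state.

```python
# Slope and intercept from Eqs. (5) and (6); the axis labels are assumed names.
SYMMETRY_FITS = {
    "horizontal": (1.74, -0.37),
    "vertical": (1.83, -0.48),
}


def predicted_symmetry_rating(s: float, axis: str) -> float:
    """Predicted mean symmetry rating for symmetry value s (benchmark = 10)."""
    if not 0.0 <= s <= 1.0:
        raise ValueError("s is expected to lie between 0 and 1")
    slope, intercept = SYMMETRY_FITS[axis]
    return 10 ** (slope * s + intercept)    # undo the assumed base-10 log


for s_value in (0.5, 0.75, 1.0):
    print(s_value,
          predicted_symmetry_rating(s_value, "horizontal"),
          predicted_symmetry_rating(s_value, "vertical"))
```

Because the fit is linear in the log of the rating, equal steps in s correspond to equal ratios, not equal differences, in the predicted rating.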

3.2.2. Balance ratings

The log of the mean subject ratings for horizontal and vertical balance in Experiment 1 are plotted against $x_b$ and $y_b$, respectively, in Figs. 7 and 8. Subjects were asked to give high ratings to an image if it was evenly balanced and to give lower ratings to images where balance was skewed in one direction; thus, the subject ratings increase as balance values approach 0.5 (perfect balance) from either direction.

Fig. 7. Log of mean subject ratings for horizontal balance are plotted against $x_b$ values. (Axes: Horizontal Balance Algorithm; Mean Subject Rating.)

Fig. 8. Log of mean subject ratings for vertical balance are plotted against $y_b$ values. (Axes: Vertical Balance Algorithm; Mean Subject Rating.)

Both figures show that subjects were able to judge the balance of images quite well, with lower ratings becoming more common as balance moves away from 0.5 in either direction. The high degree of certainty for images with balance values of 0.5 shows that subjects are quite confident that these images are much more balanced than the benchmark image, which has b = (0.45, 0.55).

Fig. 9. Log of mean subject ratings for horizontal balance are plotted against $x_b'$ values. As $x_b'$ increases, the balance of the image becomes more horizontally centered. (Axes: Transformed Horizontal Balance Algorithm; Exp 1: Mean Subject Rating.)

As the graphs illustrate, subject ratings of horizontal balance were not highly influenced by whether images were skewed to the left or to the right and ratings of vertical balance were not highly influenced by whether images were skewed to the top or the bottom. This result allows the balance attributes to be slightly transformed such that balance points which are equidistant from the middle point are merged. For example, horizontal balance points at 0.4 and 0.6 are assigned the same value as are balance points at 0.3 and 0.7. This transformation is illustrated mathematically in Eq. (7) such that $b'$ is the new balance measure and b is the existing measure. This transformation suggests a balance measure that is similar in nature to that proposed by Streveler and Wasserman (1984)

$$ b' = 1 - |2b - 1|. \tag{7} $$
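The folding performed by Eq. (7) is easy to verify numerically. In the minimal sketch below the function name is hypothetical; it is applied to one normalized balance coordinate at a time.

```python
def transformed_balance(b: float) -> float:
    """Eq. (7): fold a normalized balance coordinate about the midpoint 0.5."""
    if not 0.0 <= b <= 1.0:
        raise ValueError("b is a normalized coordinate between 0 and 1")
    return 1.0 - abs(2.0 * b - 1.0)


# Balance points equidistant from the center map to the same value.
for left, right in ((0.4, 0.6), (0.3, 0.7), (0.0, 1.0)):
    print(left, right, transformed_balance(left), transformed_balance(right))
```

Applied to the benchmark image with b = (0.45, 0.55), both coordinates are equally far from 0.5 and therefore receive the same transformed value.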

Values of $b'$ can vary from 0, in the case where the balance point of an image is at the extreme edge, to 1, where the balance point is the middle of the balance axis. The benefit of this transformation is that it allows for a functional relationship between the balance quantification and subject ratings. The values of mean subject ratings are plotted against $b'$ for horizontal and vertical balance in Figs. 9 and 10, respectively.

As with symmetry, subject ratings of balance can be given as a function of the transformed $b'$. Eqs. (8) and (9) give the log of the mean subject rating of balance as a function of $b'$ for balance in the horizontal and vertical planes

$$ \mathrm{LOG}(r_{\mathrm{HORBAL1}}) = 1.90\, b'_X - 0.65, \quad R^2 = 0.74, \tag{8} $$

$$ \mathrm{LOG}(r_{\mathrm{VERTBAL2}}) = 1.83\, b'_Y - 0.57, \quad R^2 = 0.79. \tag{9} $$

Subjects were particularly confident in rating an image with perfect symmetry as having perfect balance because, by definition, an image with s = 1 must have a balance rating of b = 0.5 (or $b'$ = 1.0) in the corresponding dimension. Subjects gave lower ratings to images that had perfect balance but were not perfectly symmetric. Overall, however, subjects were able to distinguish between varying levels of balance with a sufficient degree of accuracy and their judgments reflected the quantification method.

3.2.3. Aesthetic appeal ratings

The complete ordered aesthetic scores obtained with the BIB ranking method are given in Fig. 11, with the most aesthetically appealing image in the upper left and the least aesthetically appealing in the lower right. The aesthetic scores scale all the images along an interval scale with the most appealing image receiving a score of 11.13 and the least appealing being scored at 0. It is very interesting to note that the top 8 scored images are all symmetrical about one or both axes and that the bottom 17 images have no symmetry.

Fig. 10. Log of mean subject ratings for vertical balance are plotted against $y_b'$ values. As $y_b'$ increases, the balance of the image becomes more vertically centered. (Axes: Transformed Vertical Balance Algorithm; Exp 1: Mean Subject Rating.)

A qualitative analysis of the images shows that there is a distinct difference between the non-symmetric images with scores higher than 5.5 and those that are less appealing. While it is not quantified with any of the attributes in the experiment, the images with an aesthetic score below 5.5 lack a certain coherency or focus that is present in the images scored higher.

While the attributes of symmetry and balance were validated by subject ratings and the BIB results give some insight into how the measures, particularly symmetry, influence aesthetic preferences, the question remains about exactly what influence the attributes have on the aesthetic appeal.

It was hypothesized that the maximum value of horizontal and vertical symmetry may have some relation to the aesthetic ratings, particularly in light of the results from the aesthetic scores found in the BIB ranking. This is based on the assumption that, independent of the direction, the existence of any symmetry increases the aesthetic appeal of an image. Eq. (10) gives the log of the mean aesthetic appeal rating, r, as a function of $s_{\mathrm{MAX}}$, the maximum of the horizontal and vertical s values. Statistical analysis showed no difference between the ratings of aesthetic appeal obtained in trials 1 and 3, and thus the mean of the two trials is used as r:

$$\log(r_{\mathrm{AESTHETIC}}) = 0.68\,s_{\mathrm{MAX}} + 0.46, \quad R^2 = 0.59. \tag{10}$$
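To make Eq. (10) concrete, the following sketch evaluates the fitted relation for a pair of symmetry values. The function name is hypothetical, s values are taken to lie in [0, 1], and LOG is assumed to be base 10; none of these details is confirmed by the text.

```python
def predicted_aesthetic_rating(s_hor: float, s_vert: float) -> float:
    """Eq. (10): predicted mean aesthetic rating from the larger symmetry value.

    ASSUMPTIONS: LOG is base 10 and both s values lie in [0, 1].
    """
    s_max = max(s_hor, s_vert)  # symmetry about either axis counts equally
    return 10.0 ** (0.68 * s_max + 0.46)


# Ratio between an image that is perfectly symmetric about one axis
# and an image with no symmetry at all.
symmetry_gain = predicted_aesthetic_rating(1.0, 0.0) / predicted_aesthetic_rating(0.0, 0.0)
```

Under these assumptions the fitted line predicts a rating about 4.8 times higher (a factor of $10^{0.68}$) for a perfectly symmetric image than for one with no symmetry, although with $R^2 = 0.59$ a substantial share of the variance remains unexplained.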

4. Experiment 2

In contrast to Experiment 1, in which abstract black-and-white images were used, Experiment 2 employed web pages to evaluate the image quantification metrics. The web pages were developed to match the abstract images in their composition.

4.1. Methods

4.1.1. Participants

For Experiment 2, 16 subjects aged 18–27 participated in four experimental sessions. All subjects had normal (20/20) or corrected-to-normal vision and normal color vision. None of the subjects had participated in Experiment 1. As in Experiment 1, art and architecture students were not allowed to participate, to avoid introducing any potential influence of specialized aesthetic training or background. The entire experimental procedure took approximately 1 h and 15 min, and subjects were compensated $10.00 per hour for their participation.

4.1.2. Stimuli

The same 30 compositions from Experiment 1 were used in Experiment 2, but their features were altered to create images that look like web pages. Fig. 12 gives an example of the stimuli used in Experiment 2, showing the benchmark web page and the web page used for question rehearsal. Because the same compositions were used in both experiments, the underlying s, b and n values remained unchanged for the second experiment. This was done under the assumption that the web page text is considered as the background and that the images in the web pages are analogous to the solid blocks used in Experiment 1.

4.1.3. Procedure

The same experimental procedure was used in Experiment 2 as in Experiment 1, with one minor modification. The experiment was arranged in the following four stages, where stage 3 deviated from the procedure of Experiment 1: (1) subjects rated the aesthetic appeal; (2) subjects completed a BIB ranking; (3) subjects rated the aesthetic appeal for a second time; and (4) subjects rated balance, symmetry and aesthetic appeal. This design allows the aesthetic appeal ratings to be checked for repeatability from stage 1 to stage 3 and accounts for any novelty effects that may arise in rating the images for the first time. The subjects were instructed to make their judgments solely on the basis of the overall layout of the web page, not on the content of the images or the text. The 11-point web page text was shown at a resolution of 50%, making it virtually impossible to read. To minimize any potential novelty effects of the photographs, all the photographs were shown to each subject prior to the experimental trials.

4.2. Results and discussion

4.2.1. Symmetry ratings

Fig. 11. BIB results given in aesthetic scores from the most appealing (upper left) to least appealing (lower right). The aesthetic score is given below each image, with higher numbers representing more aesthetically appealing images.

The log of the mean subject symmetry ratings for Experiment 2 is plotted against the s values for symmetry about a horizontal and a vertical axis in Figs. 13 and 14. Eqs. (11) and (12) give the log of the subject rating, r, as a function of s for symmetry about horizontal and vertical axes, respectively:

$$\log(r_{\mathrm{HORSYM2}}) = 1.25\,s_{\mathrm{HOR}} - 0.071, \quad R^2 = 0.75, \tag{11}$$

$$\log(r_{\mathrm{VERTSYM2}}) = 1.28\,s_{\mathrm{VERT}} - 0.11, \quad R^2 = 0.69. \tag{12}$$

The regression equations show that subject symmetry ratings of the web pages had a strong relationship with s, and that s provides a good estimate of subject perceptions of symmetry.
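Relations such as Eqs. (11) and (12) have the form of a straight-line fit of the log of the mean rating against the algorithm value. The sketch below shows how such a fit and its $R^2$ could be computed, assuming an ordinary least-squares fit and base-10 logs; it is not the authors' analysis code. The helper `fit_log_rating` is hypothetical, and the example data are synthetic, generated without noise from Eq. (11) purely to check that the fit recovers those coefficients.

```python
import math


def fit_log_rating(metric, mean_ratings):
    """Least-squares fit of log10(mean rating) on a metric.

    Returns (slope, intercept, r_squared).
    """
    y = [math.log10(r) for r in mean_ratings]
    count = len(metric)
    mean_x = sum(metric) / count
    mean_y = sum(y) / count
    sxx = sum((x - mean_x) ** 2 for x in metric)
    sxy = sum((x - mean_x) * (yi - mean_y) for x, yi in zip(metric, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - residual sum of squares / total sum of squares
    ss_res = sum((yi - (slope * x + intercept)) ** 2 for x, yi in zip(metric, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# SYNTHETIC, noise-free data built from Eq. (11); not data from the study.
s_values = [0.0, 0.25, 0.5, 0.75, 1.0]
synthetic_ratings = [10.0 ** (1.25 * s - 0.071) for s in s_values]
slope, intercept, r_squared = fit_log_rating(s_values, synthetic_ratings)
```

Because the synthetic ratings lie exactly on the line, the fit returns the slope and intercept of Eq. (11) with $R^2$ equal to 1 up to rounding; real subject ratings scatter around the line, which is why the reported $R^2$ values are lower.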


Fig. 12. Example of experimental stimuli from Experiment 2. Shown here are the benchmark image (left) and the example image used for question rehearsal (right).

Fig. 13. Log of mean subject ratings for symmetry about a horizontal axis (x-axis: horizontal symmetry algorithm; y-axis: Exp. 2 mean subject rating).


Fig. 14. Log of mean subject ratings for symmetry about a vertical axis (x-axis: vertical symmetry algorithm; y-axis: Exp. 2 mean subject rating).



4.2.2. Balance ratings

The log of the mean subject ratings for horizontal and vertical balance in Experiment 2 is plotted against $x_b$ and $y_b$, respectively, in Figs. 15 and 16. As in Experiment 1, balance ratings do not depend on whether the web page was skewed in one direction or the other, so b can again be transformed to $b'$ in order to establish a relationship with subject ratings. The log of the mean subject ratings is plotted against the values of $b'$ for both horizontal and vertical balance in Figs. 17 and 18. Eqs. (13) and (14) give the log of the mean balance rating as a function of $b'$ for balance in the horizontal and vertical planes, respectively:

$$\log(r_{\mathrm{HORBAL2}}) = 1.76\,b'_X - 0.69, \quad R^2 = 0.72, \tag{13}$$

$$\log(r_{\mathrm{VERTBAL2}}) = 1.76\,b'_Y - 0.65, \quad R^2 = 0.85. \tag{14}$$

Similar to the findings regarding balance from Experiment 1, subjects were able to accurately judge the balance of the web images in Experiment 2, and their judgments matched the balance metric.

4.2.3. Aesthetic appeal ratings

The complete ordered aesthetic scores obtained with the BIB ranking method are given in Fig. 19, with the most aesthetically appealing image in the upper left and the least aesthetically appealing in the lower right. The aesthetic scores place the images on an interval scale, with the most appealing image receiving a score of 15.88 and the least appealing being scored at 0.

Unlike in the first experiment, where symmetry was the primary indicator of a higher aesthetic score, the greatest predictor of a high aesthetic score appears to be n, the number of groups. The top 3 web pages have 2 or 3 groups and almost all of the web pages with a smaller n are in the top half of the ranking, while the majority of the web pages with the largest n tested (n = 7) are in the lower half of the ranking. One of the major factors underlying this conclusion appears to be an increased sense of organization, such that the more well-organized web pages with fewer elements are ranked higher.

Fig. 15. Log of mean subject ratings for horizontal balance plotted against $x_b$ values for Experiment 2 (x-axis: horizontal balance algorithm; y-axis: mean subject rating).

Fig. 16. Log of mean subject ratings for vertical balance plotted against $y_b$ values for Experiment 2 (x-axis: vertical balance algorithm; y-axis: mean subject rating).

Fig. 17. Log of mean subject ratings for horizontal balance plotted against $x_b'$ values (x-axis: transformed horizontal balance algorithm; y-axis: Exp. 2 mean subject rating). As $x_b'$ increases, the balance of the image becomes more horizontally centered.

Fig. 18. Log of mean subject ratings for vertical balance plotted against $y_b'$ values (x-axis: transformed vertical balance algorithm; y-axis: Exp. 2 mean subject rating). As $y_b'$ increases, the balance of the image becomes more vertically centered.

Just as the results from the BIB ranking indicate that n was the largest predictor of an increased aesthetic appeal ranking, it is highly likely that the number of groups also played a part in the ratings given in trials 1, 3 and 4. Eq. (15) gives the log of the mean aesthetic appeal rating, r, as a function of n. Statistical analysis showed no differences between the ratings of aesthetic appeal obtained in trials 1, 3 and 4, and thus the mean of these ratings is used as r. Using n as a proxy for the image complexity, this equation indicates that subjects rated less complex web pages as more aesthetically appealing:

$$\log(r_{\mathrm{AESTHETIC}}) = -0.06\,n + 1.39, \quad R^2 = 0.30. \tag{15}$$
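As a rough illustration of what Eq. (15) implies, the sketch below compares the predicted rating for the smallest and largest numbers of groups mentioned in the text. The function name is hypothetical and LOG is again assumed to be base 10.

```python
def predicted_appeal_rating(n_groups: int) -> float:
    """Eq. (15): predicted mean aesthetic rating from the number of groups.

    ASSUMPTION: LOG is base 10.
    """
    return 10.0 ** (-0.06 * n_groups + 1.39)


# Compare a sparse page (n = 2) with the densest pages tested (n = 7).
sparse_to_dense_ratio = predicted_appeal_rating(2) / predicted_appeal_rating(7)
```

Under this assumption the fitted line predicts that a page with 2 groups is rated roughly twice as appealing as one with 7 groups (a factor of $10^{0.30}$), although with $R^2 = 0.30$ the fit accounts for only a modest share of the variance in the ratings.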

5. Conclusions

The central purpose of these two experiments was to create simple methods to quantitatively analyze and describe the composition of visual imagery. The experimental methodology was designed to use a bottom-up process such that the overall aesthetic appeal of an image might prove to be partially based on individual compositional attributes. Formulae for both symmetry and balance in both horizontal and vertical dimensions were developed and validated against subject ratings for those attributes.

Additionally, a strong relationship between perfect symmetry and overall aesthetic appeal was shown in the basic imagery of Experiment 1, but it was shown to diminish in the more realistic looking web pages used in Experiment 2. These findings lend support to the aesthetic theories that emphasize the organizing role of symmetry in aiding the viewer's understanding of pictorial composition. This understanding is reflected in the higher ratings of these symmetric images.


Fig. 19. BIB results given in rank order from the most appealing (upper left) to least appealing (lower right). The aesthetic score is given below each web page, with higher numbers representing more aesthetically appealing images.


In a finding similar to that of Tullis (1984), the use of a large number of groups in the web pages was found to have negative effects on the dependent measure. While Tullis used response time to show the relationship, this experiment used aesthetic appeal. It is highly likely that the number of groups is a proxy for the complexity of the interface. Just as Tullis suggested that displays can become too complicated and dense, the value of n in the second study is an indicator of the same sense of overcrowding and complexity.

These findings can help influence design in multiple ways. The use of symmetry in very basic imagery, such as icons or logos, has a high probability of making those images more aesthetically appealing. The use of symmetry in more complex interfaces, such as web pages, becomes less important than making sure the web page is clear and coherent. This can be achieved by reducing the number of compositional elements to the smallest number possible.

The findings of the present study also lend support to the important role of psychophysical theories of aesthetics emphasizing the quantitative relationships between aesthetic response and basic pictorial elements, as investigated in the present study. It should be noted, however, that the present study was designed in a way that minimizes the influence of the pictorial contents on aesthetic judgments through the use of abstract stimuli in Experiment 1 and the same set of pictures seen in advance in Experiment 2. It is highly likely that other schools of theories (Liu, 2003a) will also play an important role in task settings that employ content-rich stimuli and involve active user participation. For example, Kaplan (1987) emphasizes the role of information interchange between the user and the environment, in which people are not passive recipients but active seekers of information and the preferred environments are those that satisfy these informational needs. We plan to extend the research reported in this article to examine the joint contributions of compositional elements and information content when users actively explore a task situation.

References

Boring, E., 1950. A History of Experimental Psychology, second ed. Appleton-Century-Crofts, Inc., New York.

Buss, D.M., Barnes, M., 1986. Preferences in human mate selection. Journal of Personality and Social Psychology 50, 559–570.

Cropper, A.G., Evans, S.J.W., 1968. Ergonomics and computer display design. The Computer Bulletin 12 (3), 94–98.

Hallnas, L., Redstrom, J., 2002. From use to presence: on the expressions and aesthetics of everyday computational things. ACM Transactions on Computer-Human Interaction 9 (2), 106–124.

Healey, C.G., Enns, J.T., 2002. Perception and painting: a search for effective, engaging visualizations. IEEE Computer Graphics and Applications 22 (2), 10–15.

Jordan, P.W., 1997. Human factors for pleasure in product use. Applied Ergonomics 29 (1), 25–33.

Kaplan, S., 1987. Aesthetics, affect, and cognition: environmental preference from an evolutionary perspective. Environment and Behavior 19, 3–32.

Kim, J., Lee, J., Choi, D., 2003. Designing emotionally evocative homepages. International Journal of Human-Computer Studies 59, 899–940.

Langlois, J.H., Roggman, L.A., 1990. Attractive faces are only average. Psychological Science 1, 115–121.

Lavie, T., Tractinsky, N., 2004. Assessing dimensions of perceived visual aesthetics of web sites. International Journal of Human-Computer Studies 60, 269–298.

Liu, Y., 2003a. Engineering aesthetics and aesthetic ergonomics: theoretical foundations and a dual-process research methodology. Ergonomics 46, 1273–1292.

Liu, Y., 2003b. The aesthetic and the ethic dimensions of human factors and design. Ergonomics 46, 1293–1305.

Liu, Y., Wickens, C.D., 1992. Use of computer graphics and cluster analysis in aiding relational judgment. Human Factors 34 (2), 165–178.

Locher, P.J., 1996. The contribution of eye-movement research to an understanding of the nature of pictorial balance perception: a review of the literature. Empirical Studies of the Arts 14 (2), 143–163.

Locher, P.J., Nagy, Y., 1996. Vision spontaneously establishes the percept of pictorial balance. Empirical Studies of the Arts 14 (1), 17–31.

Locher, P.J., Stappers, P.J., Overbeeke, K., 1998. The role of balance as an organizing design principle underlying adult’s compositional strategies for creating visual displays. Acta Psychologica 99 (2), 141–161.

Ngo, D.C.L., 2001. Measuring the aesthetic elements of screen design. Displays 22 (3), 73–78.

Ngo, D.C.L., Teo, L.S., Byrne, J.G., 2003. Modelling interface aesthetics. Information Sciences: An International Journal 152 (1), 25–46.

Ognjenovic, P., 1991. Processing of aesthetic information. Empirical Studies of the Arts 9 (1), 1–9.

Prince, P.D., 2000. Computer art in the new millennium. IEEE Computer Graphics and Applications 20 (1), 26–27.

Reiser, H., Reiser, B., 1995. Aesthetic considerations unique to interactive multimedia. IEEE Computer Graphics and Applications 15 (3), 24–28.

Sears, A., 1993. Layout appropriateness: a metric for evaluating user interface widget layout. IEEE Transactions on Software Engineering 19 (7), 707–719.

Streveler, D.J., Wasserman, A.I., 1984. Quantitative measures of the spatial properties of screen designs. In: INTERACT ’84 Conference Proceedings. North-Holland, Amsterdam.

Tractinsky, N., Katz, A.S., Ikar, D., 2000. What is beautiful is usable. Interacting with Computers 13 (2), 127–145.

Tullis, T.S., 1983. The formatting of alphanumeric displays: a review and analysis. Human Factors 25 (6), 657–682.

Tullis, T.S., 1984. A computer-based tool for evaluating alphanumeric displays. In: Proceedings of the INTERACT ’84 Conference on Human-Computer Interaction, London, September 1984.

Tullis, T.S., 1988a. A system for evaluating screen formats: research and application. In: Hartson, H.R., Hix, D. (Eds.), Advances in Human-Computer Interaction, vol. 2. Ablex, Norwood, NJ, pp. 214–286.

Tullis, T.S., 1988b. Screen design. In: Helander, M. (Ed.), Handbook of Human-Computer Interaction. Elsevier Science Publishers B.V., North-Holland, Amsterdam, pp. 377–411.

Wilson, A., Chatterjee, A., 2005. The assessment of preference for balance: introducing a new test. Empirical Studies of the Arts 23 (2), 165–180.

World, L., 1996. Aesthetic selection: the evolutionary art of Steven Rooke. IEEE Computer Graphics and Applications 16 (1), 4.
