Int. J. Human-Computer Studies (2001) 54, 211–236. doi:10.1006/ijhc.2000.0408. Available online at http://www.idealibrary.com

Focal structures and information types in Prolog

PABLO ROMERO†

School of Cognitive and Computing Sciences, The University of Sussex, Brighton, BN1 9QN, U.K. email: juanr@cogs.susx.ac.uk

Several studies have suggested that the mental structures of programmers of procedural languages have a close relationship with a model of structural knowledge related to functional information known as programming Plans. It has also been claimed that experienced programmers organize this representation in a hierarchical structure where some elements of Plans are focal or central to them. However, it is not clear that this is the case for other types of programming languages, especially for those which are significantly different from the procedural paradigm.

The study reported in this paper investigates whether these claims are true for Prolog, a language which has important differences from procedural languages. Prolog does not have obvious syntactic cues to mark blocks of code (begin/end, repeat/until, etc.). Also, its powerful primitives (unification and backtracking) and the extensive use of recursion might influence how programmers comprehend Prolog code in a significant way.

The findings of the study suggest that Plans and functional information are important for Prolog programmers, but that there is also at least one other model of structural knowledge valid for this language. This model of structural knowledge, Prolog schemas, is related to data structure information and it seems that a hierarchical organisation that highlights the relevance of some of its elements as focal is valid for Prolog. These results support the view that comprehension involves the detection of varying aspects of the code and that each of the structures related to these aspects might have their own organization and hierarchical relations.

© 2001 Academic Press

1. Introduction

Program comprehension is a skill that is central to programming, so having a clear picture of comprehension as a cognitive process is a prerequisite to building models of programming tasks such as debugging, modification, reuse, etc. Yet comprehension has been studied mainly for languages belonging to the procedural programming paradigm.

Programmers are said to be able to comprehend computer programs because of the structural and strategic knowledge about programming that they have acquired through experience. The programming Plan concept has been the dominant model used to represent the structural knowledge that programmers possess (Détienne, 1990). Programming Plans are said to be strongly related to what the program does (functional information). It has been suggested that Plans typically comprise several elements with different degrees of relevance. This hierarchical organization can explain why experienced programmers organize the programs they comprehend in such a way that some parts of them become more relevant than others (Davies, 1993, 1994).

†Supported by a grant from the Mexican Council for Science and Technology, CONACYT.

1071-5819/01/020211+26 $35.00/0 © 2001 Academic Press

However, these claims are yet to be tested for programming paradigms other than the procedural one. A programming language significantly different from this trend, and therefore suitable to explore these claims, is Prolog. Several studies have tried, without much success, to find evidence of a relationship between the programming Plans idea and the mental representations of Prolog programmers (Bellamy & Gilmore, 1990; Ormerod & Ball, 1993).

This paper explores the nature of Prolog programmers' structural knowledge, adopting the model proposed by Davies (1993, 1994), but extending the notion of focal structures to consider not only function but also other programming information types. This investigation is performed through two experiments. The first one applies a program recall task to find out which of several structural models are relevant for Prolog programmers, while the second one compares the importance of two specific structural models for this same programming language through a code recognition task.

This paper is divided into four parts. The next section gives a brief account of program comprehension studies. The subsequent sections describe and discuss the two experiments and the final part analyses the global results.

2. Program comprehension

Several models of the programming task propose that a program comprises several kinds of information (Brooks, 1983; Pennington, 1987). Program comprehension is a complex cognitive process that involves the acknowledgement and understanding of these information types implicit in a program. The result of this process is a more or less detailed mental representation that the programmer builds of the program she has studied. The qualities of this mental representation vary according to several factors, among them the programmer's skill level, the size of the program and the task in hand.

In order to understand the sources of knowledge that programmers use to build these mental representations during comprehension, several structural models of this programming knowledge have been proposed. Most of these structural models are based on a theory of knowledge organization in memory known as schema theory (Schank & Abelson, 1977). This theory proposes that comprehension is mediated by schematic stereotypical knowledge structures or schemas. A schema is therefore a data structure for representing, in memory, generic concepts of a domain.

These structural models propose specific structures that are said to be a good approximation of the internal knowledge structures that enable programmers to organize their comprehension of programs in a particular way. This structural organization sometimes highlights specific information types. The next subsections describe the program comprehension process, the programming structural models and information types in more detail.

2.1. THE COMPREHENSION PROCESS

One of the earliest theories of program comprehension was introduced by Brooks (1983), who proposed a theoretical framework to understand behavioural differences in program comprehension. He regards comprehension as a process of domain reconstruction. This reconstruction involves establishing mappings from the problem domain to the program domain via some other intermediate domains. This process of establishing mappings consists of generating and refining hypotheses about program execution and its relation to the other domains. Hypothesis refinement is performed in a top-down fashion. This process begins with a primary, top-level hypothesis which is decomposed into several subsidiary, more specific hypotheses. The generation of hypotheses is performed by retrieving structural units from the programmer's knowledge. These structural knowledge units are used to generate more hypotheses or are matched against the program's code. As a result of this matching process, the code is organized into meaningful chunks or units. These chunks can be considered as the external analogues of the programmer's structural knowledge. According to Brooks, in order to perform this organization of the program into meaningful chunks, programmers look for specific patterns in the code which can confirm the proposed hypothesis. These patterns of code are known as beacons or key segments of the code's meaningful chunks.

The next section describes several models proposed to explain the nature and characteristics of the programmers' internal structural knowledge used in the comprehension process.

2.2. STRUCTURAL MODELS

When talking about programming knowledge there are four important concepts that have to be differentiated: structural knowledge, mental representation, structural model and program chunks. Structural knowledge has to do with knowledge that programmers have about semantic and syntactic structures in programming (Détienne, 1990). A mental representation, as mentioned above, is the internal representation the programmer builds of the program she is studying. A structural model can be used to explain some aspects of the programmer's structural knowledge. Program chunks are the meaningful units in which the comprehension process organizes the program code. A useful differentiation of these four concepts is that the former two are internal representations while the latter two are external structures proposed to explain some aspects of the internal ones. Also, while structural knowledge and structural models have to do with general knowledge about programming, a mental representation and a program chunk are related to the specific problem that has been studied.

If, for example, a programmer is trying to comprehend a specific piece of code, knowledge about data objects in the particular programming language she is employing, or about the way in which the clauses that make up a program in this language are executed, forms a part of her structural knowledge. What she can learn about this particular program through the comprehension process is her mental representation of it. Hypotheses trying to explain the organization of her structural knowledge can be considered as a structural model. It could be hypothesized, for example, that an important part of her structural knowledge has to do with how to implement specific goals of the application domain in the programming language she is working with. Finally, this structural model might predict that when trying to validate her assumptions about the code the programmer groups specific segments of the program into what she might consider meaningful chunks. If these program chunks, predicted by the structural model employed, can be used to explain certain characteristics of her behaviour, then it can be said that this particular structural model is a good approximation of her structural knowledge.

    Goal:  find an occurrence of ?x

           CODE                            PLAN TERMS
    Plan:  ?found := false                 initialise to not found
           loop through category of ?x
             if ?x then
               ?found := true              set it to true
           … := ?x                         use it

FIGURE 1. Plan description. From Gilmore and Green (1988).

Several structural models have been proposed to explain certain aspects of programming knowledge. As mentioned above, most of these are based on schema theory (Schank & Abelson, 1977). One of the most successful of such structural models is programming Plans (Pennington, 1987; Gilmore & Green, 1988; Davies, 1990). Although there is no agreed definition of programming Plans, most of the Plans literature assumes that they are units of schematic stereotypical programming knowledge related to the program's goals. Therefore, this structural model proposes that programmers use a goal-oriented criterion to organize their mental representations. Figure 1 gives an example of a Plan instance.

Some studies have suggested that the elements that comprise these goal-oriented units of programming knowledge have different degrees of relevance or saliency for programmers (Rist, 1989, 1995; Davies, 1993, 1994). The elements that most directly encode the goal of a particular Plan are said to be focal for experienced programmers. For example, if the programming task is to accumulate data and compute an average, the focal element of it will be the division of the running total by the number of items (Rist, 1995). According to Davies (1993), the program key segments are analogous to these focal elements; the main difference between these two concepts is that the former can be identified as actual segments of the code while the latter represent abstract structures. In code generation studies (Rist, 1989; Davies, 1991), a concept similar to key segments has been identified as focal lines. Throughout this paper, the term "key segments" will be preferred because sometimes these code segments will not be confined to a single line. Although Plans have been related to data-flow as well as to functional information, focal elements of Plans seem to be highly related to function only. Because of the functional nature of the focal elements of Plans, this paper considers Plans as related mainly to function.
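The averaging example can be made concrete with a short Prolog sketch. The predicate names mean_of/2 and sum_and_count/5 are hypothetical and do not come from the studies cited; the comment marks the line that, on Rist's account, would be the focal element of the Plan.

```prolog
% mean_of(+Numbers, -Mean): hypothetical illustration, not taken from the paper.
% Accumulates a running total and an item count, then divides.
mean_of(Numbers, Mean) :-
    Numbers = [_|_],                              % an empty list has no mean
    sum_and_count(Numbers, 0, 0, Total, Count),
    Mean is Total / Count.                        % focal element: the division

% sum_and_count(+Numbers, +Total0, +Count0, -Total, -Count)
sum_and_count([], Total, Count, Total, Count).
sum_and_count([N|Ns], Total0, Count0, Total, Count) :-
    Total1 is Total0 + N,                         % update the running total
    Count1 is Count0 + 1,                         % update the item count
    sum_and_count(Ns, Total1, Count1, Total, Count).
```

A query such as `?- mean_of([2, 4, 6], M).` exercises all three parts of the Plan; the initialisation and accumulation lines support the goal, but only the division states it.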

Wiedenbeck (1986) gives empirical evidence that supports this notion of key segments. In her study, novices and experienced programmers tried to understand and memorize a short Pascal program. After this study period, they were asked to recall as much as they could of the program code. The results showed that experienced programmers, unlike novices, recalled key segments much better than other parts of the code. According to Davies (1993, 1994), as a result of experience, programmers restructure their programming knowledge by identifying the focal elements of Plans and assigning them a high place in the Plan hierarchy. He gives empirical evidence that supports this notion of focal elements by performing an experiment that compares the recognition of key and non-key segments of Pascal programs for three groups of programmers: novices, intermediates and experienced. His results show that experienced programmers, unlike the other groups, are able to recognize key segments more quickly and more accurately than non-key segments.

    p(X) :-
        g(X,Y),
        p(Y).

FIGURE 2. The before technique. From Bowles and Brna (1993).

    length([],L,L).
    length([H|T],L0,L) :-
        L1 is L0+1,
        length(T,L1,L).

FIGURE 3. An occurrence of the before technique. From Bowles and Brna (1993).

Studies of programming Plans and focal elements have considered mainly procedural languages. However, the literature for other programming paradigms (mainly for object-oriented languages) suggests that functional information might also play a central role there in the comprehension process (Davies, Gilmore & Green, 1995; Wiedenbeck & Ramalingam, 1999). Some studies have tried, without much success, to find evidence of a relationship between Plans and Prolog programmers' mental representations (Bellamy & Gilmore, 1990; Ormerod & Ball, 1993).

There are alternative structural models for Prolog. Brna, Bundy, Dodd, Eisenstadt, Looi and Pain (1991) and Bowles and Brna (1993) propose that Prolog programmers' structural knowledge is related to "Prolog Techniques". This structural model is similar to Plans, but it comprises knowledge about how to perform specific Prolog operations. An instance of a basic programming technique is given in Figure 2. This technique's instance is called the before technique because the value of Y is constructed in the subgoal g and then sent to the recursive call. Figure 3 illustrates an occurrence of this instance in the predicate length/3. In this figure, the elements of the technique's instance are highlighted.
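As a further illustration, the same pattern can be read into an accumulator-based list reversal. The predicate rev/3 below is a hypothetical example, not one given by Bowles and Brna, and the mapping of its goals onto g and p in Figure 2 is an interpretation offered here rather than a claim from the source.

```prolog
% rev(+List, +Acc0, -Reversed): hypothetical occurrence of the before technique.
rev([], Acc, Acc).
rev([H|T], Acc0, Reversed) :-
    Acc1 = [H|Acc0],           % plays the role of g(X,Y): the new value is built first
    rev(T, Acc1, Reversed).    % plays the role of p(Y): the value is sent to the recursive call
```

Calling `?- rev([a, b, c], [], R).` builds each intermediate accumulator before the recursive call, in the same way that length/3 in Figure 3 computes L1 before recursing.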

Another structural model for Prolog is "Prolog Schemas". Gegg-Harrison (1991) proposes this structural model and describes a set of common Prolog schema instances for list processing. A specific example from this set is given in Figure 4. In this example, <<&n>> denotes any number of Prolog arguments, clauses surrounded by < > are optional and the rest are compulsory elements of the schema. This example deals with the task of processing a list until the first occurrence of an element is found. The base case ensures that E, the element that is being searched for, is found. The second clause optionally checks that the list element being processed is not the one which is being looked for, performs an optional process, makes a recursive call trying to find the element in the tail of the list and calls a second optional process. It can be seen that the compulsory elements of the schema have to do with stereotypical operations on list data structures. This particular schema instance is very similar to the example of a programming Plan instance given in Figure 1.

    schema_C([E|T],E,<<&1>>).
    schema_C([H|T],E,<<&2>>) :-
        <E \= H>,
        <pre_pred(<<&3>>,H,<<&4>>)>,
        schema_C(T,E,<<&5>>),
        <post_pred(<<&6>>,H,<<&7>>)>.

FIGURE 4. An example of a simple Prolog schema. From Gegg-Harrison (1991).
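One way to read Figure 4 is to fill in the optional parts and obtain an ordinary predicate. The sketch below is an assumed instantiation, not one listed by Gegg-Harrison: the hypothetical predicate position/3 uses the optional inequality test, omits the pre-process, and lets an arithmetic goal stand loosely in the place of the post-process.

```prolog
% position(+List, +E, -N): N is the 1-based position of the first occurrence of E.
% Hypothetical instance of the schema in Figure 4; names are illustrative only.
position([E|_], E, 1).             % compulsory base case: E has been found
position([H|T], E, N) :-
    E \= H,                        % optional check: the head is not the element sought
    position(T, E, N0),            % compulsory recursive call on the tail
    N is N0 + 1.                   % stands in for the optional post-process
```

The two compulsory elements, the base case on `[E|_]` and the recursive call on the tail, are the parts the schema treats as stereotypical list operations; the remaining goals vary from one instance to another.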

    recursive_reduction(X0,Y0) :-
        trivial(X0,Y0).
    recursive_reduction(X,Y) :-
        reduce(X,X1,C1),
        recursive_reduction(X1,Y1),
        combine(C1,Y1,Y).

FIGURE 5. An example of a Recursion points schema. From Le (1993).
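The schema in Figure 5 can be run directly once trivial/2, reduce/3 and combine/3 are supplied. The definitions below are assumptions chosen here to sum a list of numbers; they are not part of Le's example. The comments mark where recursion stops and where it is invoked.

```prolog
% Figure 5 schema, reproduced so the sketch is self-contained.
recursive_reduction(X0, Y0) :-
    trivial(X0, Y0).                    % recursion stops here
recursive_reduction(X, Y) :-
    reduce(X, X1, C1),
    recursive_reduction(X1, Y1),        % recursion is invoked here
    combine(C1, Y1, Y).

% Hypothetical helper definitions: summing a list of numbers.
trivial([], 0).                         % the empty list sums to 0
reduce([H|T], T, H).                    % split off the head, keep the tail
combine(C, Y1, Y) :- Y is C + Y1.       % add the head to the sum of the tail
```

With these helpers, `?- recursive_reduction([1, 2, 3], S).` reduces the list one element at a time until the trivial case applies, then combines the partial results on the way back.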


For some authors, an important aspect for understanding a Prolog program is understanding its flow of control (Le, 1993). In a typical Prolog program, a crucial point of this aspect is when a recursive call is made or when recursion stops. Another possible structural model can be produced precisely by considering this recursive behaviour. In an analogous way to the definition of Prolog schemas, the compulsory elements of this recursion-oriented structural model would be the places where recursion is either invoked or stops. Figure 5 gives an example of a recursion points instance. The recursion points key segments of this example are the lines where a call to the recursive_reduction/2 procedure is made.

One important point to note is that these structural models for Prolog (techniques, schemas and recursion points) were proposed for teaching purposes. Their authors do not claim a relationship between them and the Prolog programmer's mental representations. One of the purposes of this paper is to explore whether such a relationship exists. However, the absence or existence of this relationship is not necessarily related to the educational value of these structural models for Prolog. Therefore, this study does not intend to justify or validate the choice of a specific structural model for educational purposes.

There has not been any research about focal elements for structural models other than Plans. It is interesting to investigate whether the restructuring of programming knowledge with experience that Davies refers to only happens for this structural model or whether knowledge related to other types of information is also restructured as a result of programming experience.

2.3. CODE ASPECTS

A structural organization sometimes highlights a specific aspect of the code. Code aspects refer to the different ways in which a program can be interpreted, or in Pennington's words, to the different kinds of information implicit in the program text. Some of these different kinds of information can be function, data structure, data-flow and control-flow. Function refers to what the program does, data structure to the programming language objects that are used in order to implement a solution to the programming problem. Data-flow concerns the transformations which data elements undergo as they are processed and control-flow refers to the sequence of actions that will occur when the program is executed. According to Pennington, comprehension of a program involves the detection of several, complementary aspects of the code.

Programming Plans, and especially their focal elements, seem to be related to functional information. It seems clear that in the previous example about the Plan to compute an average, the place where the running total is divided by the number of items is related to what the Plan is meant to do.

Prolog techniques are concerned with how instantiations of Prolog objects are linked through the program. This characteristic seems to link this model to data-flow information, while the stress on well-known data structures and the operations performed over them make schemas related to data structure information. The example of a Prolog schema given in Figure 4 shows a typical operation that a procedure performs over a list, one of the most important data structures in Prolog. Finally, the relationship between recursive behaviour and program execution makes recursion points related to control-flow. This mapping between structural models and code aspects makes Plans a schema (in the sense of the schema theory) related to function, Prolog schemas a data structure schema, techniques a data-flow schema and recursion points a control-flow schema. In order to avoid confusion, this paper will prefer the term data structure schema over Prolog schemas. This relationship between structural models and code aspects is illustrated in Table 1.
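As an informal illustration of how the four code aspects can coexist in a few lines, the sketch below annotates a small list-summing predicate. The predicate total/2 and the assignment of each comment to an aspect are assumptions made here for illustration; they are not taken from the experimental materials.

```prolog
% total(+Numbers, -Total): hypothetical example annotated with code aspects.
total([], 0).                  % control-flow: recursion stops on the empty list
total([N|Rest], Total) :-      % data structure: the list is split into head and tail
    total(Rest, Total0),       % control-flow: recursion is invoked on the tail
    Total is Total0 + N.       % function: the sum is produced here;
                               % data-flow: Total0 is transformed into Total
```

The same line can carry more than one kind of information, which is consistent with Pennington's view that comprehension involves detecting several complementary aspects of the code.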

The experiments described in this paper employ program recall and recognition tasks to find out which structural model and related aspect of the code is important to Prolog programmers. This study adopts the knowledge restructuring theory of program comprehension by Davies (1993, 1994), but extends it to consider not only function but also other code aspects.

TABLE 1

Relationship between structural models and code aspects

    Structural model          Code aspect
    Programming Plans         Function
    Techniques                Data-flow
    Data structure schemas    Data structure
    Recursion points          Control-flow

3. Which structural model?

The experiment described in this section was concerned with exploring the nature of the mental representation of Prolog programmers through finding out which of several structural models considered is most relevant for them. The comparisons were made taking into account the focal elements of these structural models.

To measure the relevance of a specific structural model, the experiment employed a recall task similar to the one in Wiedenbeck (1986). Subjects were asked to understand and memorize a small Prolog program and then recall what they could of it. This recalled code was analysed in terms of the different structural models of Prolog and their associated key segments. The relative success of recollection of the different key segments was compared against the relative success of recollection of the rest of the program to establish the relative relevance of the different structural models for Prolog programmers. The main difference from Wiedenbeck's study is that the present experiment compared several structural models for program comprehension, while Wiedenbeck's took into account programming Plans only. The structural models taken into account in this experiment were Plans, techniques, data structure schemas and recursion points.

In addition to recalling the code, the programmer subjects were asked to describe the program's function. The motivation for including this subtask as a part of the experimental session was two-fold. First, it reduced the possibility that the programmer subjects would become concerned with memorization, rather than comprehension of the code. Second, it was a useful method of confirming the subject's level of skill. The accuracy of these descriptions was compared for novice and experienced subjects to confirm that there were differences between these two groups in terms of their program comprehension ability.

3.1. AIMS

The aim of this experiment was to find out for which model of structural knowledge there is a difference in accuracy of recall between key and non-key segments. This finding might suggest which structural model seems to be more relevant to Prolog programmers and therefore will provide information about the nature and characteristics of Prolog programmers' mental representations.

3.2. DESIGN

This experiment considered one independent variable, level of programming skill (experienced, novice and non-programmer), and nine dependent variables: the success of recollection for the key segments and the non-key segments of four different structural models of comprehension (Plans, Prolog techniques, data structure schemas and recursion points) and the accuracy of function identification by the programmer subjects.

3.3. PARTICIPANTS, PROCEDURE AND MATERIALS

There were 30 subjects: 10 experienced and 10 novice Prolog programmers and a group of 10 non-programmers. The group of experienced programmers had on average 8.6 years of Prolog experience and were either university lecturers or research fellows. The group of novices had taken a 3-month introductory course in Prolog and were either undergraduates or masters students. The group of non-programmers were undergraduate students who did not know anything about computer programming. The novice population was inexperienced in Prolog, but not in programming in general. Most of them knew three or more programming languages apart from Prolog. Also, they often had more recent contact with Prolog via their course than some of the experienced programmers.

This experiment used a control group, the group of non-programmers, because the recall experimental task might confound pure memorization with real comprehension of the code.

The novice and experienced programmer subjects of this experiment performed three similar sessions. In each session, they were given a hardcopy of the experimental program and were asked to comprehend and memorize it. This study period lasted 3 min. After this, the subjects were given 5 min to recall and write down what they could remember of the program. Finally, these subjects used another period of 3 min to write down a short explanation of what, according to them, the program did.

The control group of non-programmers followed a slightly different procedure. They were not instructed to comprehend but only to memorize the programs. Also, they were not asked to write down an explanation of what the programs did. In each case, the order of presentation of the experimental programs was randomized.

There were three experimental programs. These were Prolog versions of the "rainfall" program (Davies, 1994), of a bubble sort and of a program that performs a binary-to-decimal conversion. These programs might not be considered as typical examples of Prolog applications because they perform numeric rather than symbolic tasks. These were selected because it was not clear that the novice programmers were familiar with artificial intelligence techniques (which are typical of Prolog applications). Figure 6 shows the Prolog version of the "rainfall" program.

These programs were analysed in terms of key segments of Plans, of data structure schemas, of Prolog techniques and of recursion points according to the definitions by Rist (1995), Gegg-Harrison (1991), Bowles and Brna (1993) and Le (1993), respectively. In the case of data structure schemas, the key segments were considered as the compulsory elements of Gegg-Harrison's definition. The choice of key segments for the case of Prolog techniques was not obvious, so the experiment considered the whole of the instances of techniques as their key segments. These whole occurrences were similar to the single one highlighted in Figure 7. The key segments for the case of recursion points were considered as the lines where recursion was invoked or where it stopped. Figure 6 shows the occurrences of the key segments for the case of Plans and data structure schemas in the "rainfall" program.

Some structural models had instances with code segments in common. For example, lines 9, 10, 11, 19, 20 and 24 of Figure 6 represent key segments of Recursion Points. Another example is shown in Figure 7. This figure shows a segment of the "rainfall" program with an instance of a Prolog Technique's key segment highlighted. Note that these two instances share some elements with some of the instances of data structure schemas shown in Figure 6.

Finally, as the experimental task included the identification of the programs' functionality, "disguised" versions of these programs were presented to the subjects. The criterion used to "disguise" these programs was similar to the one used by Wiedenbeck (1986). The only clues given in these cryptic versions were the type of the variables and an indication of their use. For example, these programs followed the conventions of using variables such as First and Rest for list elements and Temp for temporary values.

    /* average(-,-) */

     1  average(Average,Max) :-
     2      read_rain(RainList),
     3      total_rain(RainList,RainAmount,NumberOfDays,Max),
     4      Average is RainAmount / NumberOfDays.

     5  read_rain(RainList) :-
     6      write('enter data'),
     7      read(Rain),
     8      next_value(Rain,RainList).

     9  total_rain([],0,0,0).
    10  total_rain([Rain|Rest],RainAmount,NumberOfDays,Max) :-
    11      total_rain(Rest,TempRainAmount,TempNumberOfDays,TempMax),
    12      RainAmount is TempRainAmount+Rain,
    13      NumberOfDays is TempNumberOfDays+1,
    14      max(TempMax,Rain,Max).

    15  max(Max,Min,Max) :-
    16      Max >= Min.
    17  max(Min,Max,Max) :-
    18      Min < Max.

    19  next_value(99999,[]).
    20  next_value(Rain,[Rain|Rest]) :-
    21      Rain =\= 99999,
    22      write('enter data'),
    23      read(NewRain),
    24      next_value(NewRain,Rest).

FIGURE 6. A version of the "rainfall" program annotated with line numbers and highlighting the focal structures of Schemas (in bold) and of Plans (in italic).

     9  total_rain([],0,0,0).
    10  total_rain([Rain|Rest],RainAmount,NumberOfDays,Max) :-
    11      total_rain(Rest,TempRainAmount,TempNumberOfDays,TempMax),
    12      RainAmount is TempRainAmount+Rain,
    13      NumberOfDays is TempNumberOfDays+1,
    14      max(TempMax,Rain,Max).

FIGURE 7. A single instance of a Techniques structure (in bold) in a segment of the "rainfall" program.


3.4. RESULTS

The data of this experiment were analysed in two parts. First, the performance of theprogrammer groups was compared in terms of the accuracy of the identi"cation of theprograms' functions. In the main part of the experimental analysis, the percentage ofaccurate recollection of key and non-key segments was compared for each one of the fourstructural models considered.

In the first case, the programmers' statements were considered as correct only if they mentioned the major functions performed by the programs. It was not surprising that the experienced programmer group was more successful in identifying the function of the programs. Their percentage of correct identification was 70%, while for the novices it was 23.3%. A chi-square test showed that this difference was significant [χ²(1) = 15.15, p < 0.01].

As mentioned earlier, this first part of the analysis was performed to confirm that there were differences in the degree of understanding of experienced and novice programmers.

The hand-written record of the subject's recollection of the code was the raw data for the main part of the experimental analysis. This analysis compared the percentages of recollection of the occurrences of the different kinds of key segments vs. the percentages of recollection of the program lines that did not contain occurrences of these key segments. For example, it can be seen in Figure 6 that lines 1–8, 12–18 and 21–23 do not share elements with any instances of data structure schemas. Therefore, the analysis for data structure schemas compared the percentage of recollection of these lines (and similar lines in the other programs) with the percentage of recollection of key segments of data structure schemas. It has to be noted that in order to mark any segment (key or non-key) as correctly recalled, all of its elements had to be recalled successfully. Figures 8–11 illustrate the results of these comparisons for the four kinds of structures.
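The all-or-nothing scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: segments are represented as sets of Figure 6 line numbers rather than individual code elements, the grouping of the schema lines into two key segments is hypothetical, and the function names and the recalled set are invented for the example.

```python
from typing import List, Set


def segment_recalled(segment: Set[int], recalled: Set[int]) -> bool:
    """A segment counts as recalled only if every one of its elements was recalled."""
    return segment <= recalled


def recall_percentage(segments: List[Set[int]], recalled: Set[int]) -> float:
    """Percentage of segments that were recalled in full."""
    if not segments:
        return 0.0
    hits = sum(1 for s in segments if segment_recalled(s, recalled))
    return 100.0 * hits / len(segments)


# Lines of Figure 6 that share no element with a data structure schema
# (1-8, 12-18 and 21-23), each scored as a one-line segment.
non_key = [{n} for n in list(range(1, 9)) + list(range(12, 19)) + list(range(21, 24))]

# Hypothetical grouping of the remaining lines into two key segments.
key = [{9, 10, 11}, {19, 20, 24}]

# Hypothetical recall record of one subject (line 24 was not recalled).
recalled = {1, 2, 3, 4, 9, 10, 11, 19, 20}

key_pct = recall_percentage(key, recalled)
non_key_pct = recall_percentage(non_key, recalled)
difference = key_pct - non_key_pct  # the per-subject score used in the analysis
```

Because a single missing element fails the whole segment, multi-line segments are penalised relative to single lines under this rule.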

The statistical analysis for this part of the study focused on the rate of change across the subject groups of the difference between the recollection of key segments of structures and lines outside them for each one of the structures considered. For example, it can be seen that for the case of data structure schemas (Figure 8) this difference is negative for non-programmers and for novices, and positive for experienced programmers. So the statistical analysis for each structure considered one independent variable (level of skill) and one dependent variable (key–non-key segment percentage of recollection difference). For each one of the four comparisons, a one-way ANOVA was run after verifying that its assumptions had been met. The only case for which this rate of change among groups was significant was for data structure schemas [F(2,29) = 8.57, p < 0.05].
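As a rough illustration of this analysis, the sketch below computes a one-way ANOVA F statistic by hand, with skill level as the grouping factor and the key minus non-key recall difference as the score. The scores are made-up values chosen only to make the arithmetic easy to follow; they are not the study's data, and the function name is an assumption. The reported degrees of freedom (2, 29) are consistent with three groups and 32 subjects in total.

```python
from typing import Dict, List, Tuple


def one_way_anova(groups: Dict[str, List[float]]) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) for a one-way between-subjects ANOVA."""
    all_scores = [x for scores in groups.values() for x in scores]
    grand_mean = sum(all_scores) / len(all_scores)

    ss_between = 0.0
    ss_within = 0.0
    for scores in groups.values():
        mean = sum(scores) / len(scores)
        ss_between += len(scores) * (mean - grand_mean) ** 2
        ss_within += sum((x - mean) ** 2 for x in scores)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical difference scores (key minus non-key recall percentage).
scores = {
    "non-programmers": [-3.0, -2.0, -1.0],
    "novices": [-2.0, -1.0, 0.0],
    "experienced": [2.0, 3.0, 4.0],
}

f_stat, df_b, df_w = one_way_anova(scores)
```

A large F here reflects the same pattern described in the text: a difference score that is negative for the less skilled groups and positive for the experienced group.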

This analysis considered interaction effects because, as can be seen in Figures 8–11, non-key segments were in general recalled better than key segments. This effect had to do with the fact that the different kinds of key segments had different average sizes, and some of them were considerably larger than the average line of code. As the criteria to consider a segment as correctly recalled required that all of its elements had been recalled successfully, the shorter segments normally had a higher recollection percentage than the longer ones. Another factor that contributed to this disparity in recollection was that the location of some key segments in the code was not balanced (some of them tended to appear at the bottom of the program).

FIGURE 8. Percentage of recall for key segments of data structure schemas and lines outside them.

FIGURE 9. Percentage of recall for key segments of Plans and lines outside them.

FIGURE 10. Percentage of recall for key segments of techniques and lines outside them.

FIGURE 11. Percentage of recall for key segments of recursion points and lines outside them.

Although a direct comparison across key segments of the different structures showed that again those of data structure schemas were the most relevant for programmers, this comparison was not considered reliable because of the disparity in size and location in the code of the different types of key segments.

3.5. DISCUSSION

The results show that data structure schemas, stereotypical patterns of programming procedures related to data structure aspects, seem to become more important with experience for Prolog program comprehension. When comparing the difference between the percentage of recollection of key and non-key segments for each structure, data structure schemas were the only case for which the difference between groups was significant. These results show that data structure schemas become more important for the comprehension process as Prolog programmers develop higher levels of skill.

Following Brooks' hypothesis, it could be said that the key segments of data structure schemas seem to be the patterns in the code that Prolog programmers use to guide their comprehension process. This contrasts with procedural languages, where Plans and their key segments seem to be important (Wiedenbeck, 1986; Davies, 1994).

The finding that information related to data structures is important for the comprehension process in Prolog is in agreement with the results of Bergantz and Hassel (1991). They found that "data structure relationships play a dominant role at the beginning of the comprehension process" (p. 323) for the case of Prolog. Although the period they considered as the beginning of the comprehension process was approximately 3 times what the present experiment took into account (they gave their subjects 10 min to look at the code while the participants of the present experiment had 3 min to comprehend the programs) and the experimental task was different (program modification), the basic finding is similar.

It is interesting to compare these results with those obtained by Wiedenbeck (1986) because the experiment reported in this paper is similar to hers. She found that focal elements of Plans were relevant for the case of Pascal. Figure 9 can be used to make a closer comparison. With a graph similar to this one, Wiedenbeck shows how functional information is very important only for experienced programmers. Her results were not replicated in the present study. It could be argued that the experimental programs were different, but when considering only the bubble sort program, which is similar to the shell sort program Wiedenbeck uses, the results are basically the same as those obtained when taking into account all three programs. So it seems that the key difference is the programming language considered, although taking a closer look at this language's properties and at the experiment's characteristics might offer a more precise explanation for this difference in results.

It seems reasonable to think that in the absence of any other information, patterns of typical operations performed over familiar data structures will be very important in starting to make sense of the code. Recall that in this experiment there was neither internal nor external documentation and variable and procedure names were cryptic. This lack of documentation and meaningful variable names seems to be an important issue for Prolog. Green, Bellamy and Parker (1997) mention that Prolog, due to its poor "role-expressiveness", is especially sensitive to naming style ('Salient variable names are almost the only method of making a Prolog program "role-expressive" and thereby revealing the Plan structures', p. 142). An obvious question is how naming style influences the mental representation built during program comprehension, or in other words, which aspect of the program (data-flow, control-flow, data structure or function) is most relevant for programmers when meaningful variable names are available.

Another aspect to consider is the combination of the nature of the recall task and Prolog. Recall was particularly useful for this experiment because several structural models could be explored through the same task. However, there are some characteristics of the recall task that have to be looked at in more detail. According to Kintsch (1998), retrieval is an important subtask in recall, and material which is easy to associate and organize is more likely to be retrieved and therefore recalled. Key segments of data structure schemas have a well-defined internal organization and a high degree of independence from other parts of the program. This is not the case for the key segments of other structures. For example, from Figure 6 it can be seen that the key segments of data structure schemas can be understood without reference to any variable or procedure name external to them; however, this is not the case for the key segments of Plans. To understand how RainAmount is computed, first the variables TempRainAmount and Rain have to be traced, but these variables are dependent on other segments of the code, among them a part of a data structure schema key segment. If some of these other segments of code cannot be recalled correctly, it would be much harder to recall this particular Plan key segment. In a way, the data structure schema model slices Prolog programs without "cutting" any data-flow links of the program's variables, creating a chunk that can be considered as an independent unit and that could therefore be organized, retrieved and recalled more easily. However, this does not mean that data structure schemas are necessarily relevant to Prolog programmers, or that other aspects of the code are not important to them. These other aspects of the code might be related to segments which are highly associated with other parts of the program, and therefore dependent in terms of retrieval.

The finding that focal elements of data structure schemas are important for Prolog program comprehension could be explained by at least three causes (or a combination of them). One possible cause is that the comprehension process normally focuses on functional information, but the use of cryptic variable names and the "opacity" of the programming language exacerbated the difficulty of detecting functional information. It seems that the most relevant aspect (or perhaps the only one) that can be detected in these circumstances is data structure. This explanation assumes that the aspect of the code which is accessible to programmers determines what segments of the program are considered as focal.

Another possible cause is that recall, as opposed to some other task, favoured key segments of data structure schemas and blurred the importance of other structural models. Finally, it is also possible that comprehension in Prolog is linked to data structure, and that therefore data structure information is relevant regardless of factors like naming style and experimental task.

The first explanation would predict that data structure aspects are only important when cryptic naming style is considered. When other aspects, function among them, are easily available in the code, Prolog programmers would not be different from those of procedural languages and would take functional information as the most relevant aspect of the code. The second explanation implies that the choice of a relevant aspect would change when considering a different experimental task. In this case, assuming again that Prolog comprehension is not that different from comprehension in procedural languages, a likely outcome is that function is the relevant aspect for most experimental conditions. The final explanation would predict that as data structure is the relevant aspect in Prolog, this information type would be the most important for Prolog programmers regardless of experimental task and conditions.

These considerations suggest the need for an experiment that controls for naming style and modifies the experimental task. The next section describes such an experiment.

4. A comparison of two structural models

The experiment reported in this section tried to find out which structural model, programming Plans or data structure schemas, best characterizes the structural knowledge of Prolog programmers. Data structure information was considered because the experiment reported in the previous section indicated that this kind of information is important for Prolog programmers. Functional information was considered because there is strong empirical support, at least from the procedural languages literature, that function is an important aspect of programming knowledge. It was not possible to take into account further structural models because the nature of the experiment would have required recruiting a large number of Prolog programmers and, given that this language is not popular beyond academic circles, finding many Prolog users would have been quite difficult.

The experiment described in this section considered a recognition rather than a recall task and compared some characteristics of the comprehension of programs employing variables and procedures with cryptic and meaningful names. This experiment employed a recognition task because recall, as mentioned above, might confuse the difficulty of associating and organizing program segments with their degree of relevance. Finally, naming style was considered as an experimental variable due to its importance for Prolog comprehension.

4.1. HYPOTHESES

The choice of focal elements in program comprehension is affected by several factors, programming language and naming style among them. It is a combination of these factors that determines what programmers consider as focal elements in a specific case. Focal elements related to data structure aspects will be relevant for experimental conditions that provide a poor naming style. However, no structure will be considered as focal in the case of a meaningful naming style.

4.2. DESIGN

This experiment was a 2 × 2 × 4 factorial design with three independent variables (expertise level, naming style and kind of segment) and two dependent variables (response time and success level of recognition). The expertise levels were expert and novice, the naming styles were meaningful and cryptic and the kinds of segments were key according to functional aspects, non-key according to functional aspects, key according to data structure aspects and non-key according to data structure aspects.

4.3. PARTICIPANTS AND PROCEDURE

The subjects were 20 novices and 20 experienced Prolog programmers. The novice population comprised undergraduate students who had taken a one-term introductory course in Prolog. Most of them were second or third year students who knew at least two other languages apart from Prolog. As in the first experiment, the novice population was inexperienced in Prolog but not in programming in general.

The experienced programmer population had on average 11 years of Prolog programming experience and had written programs longer than 3000 lines (on average). They were either lecturers or researchers at academic institutions and eight of them had taught a Prolog course.

As the experimental materials of this experiment were similar to those of the previous one, the subjects in these two experiments were different.

The experimental procedure was very similar to the one applied in Davies (1994). Each programmer performed four code recognition sessions. In each of these sessions, they were presented with a short Prolog program for a period of 3 min. They were asked to study and memorize this program. Following this presentation, 16 program segments were displayed on the screen, one at a time, and the subjects had to indicate whether the segment had been in the program they had studied or not. Their answers and response times were recorded. The four sessions collected data about meaningful and cryptic naming styles and focal elements of Plans and of data structure schemas. Of the four sessions, two were for programs with cryptic naming style and two for programs with meaningful naming style. In each session, eight of the program segments that the subjects had to recognize were related to focal elements of Plans and the other eight to focal elements of data structure schemas. For each of these eight segments, four belonged to the program they studied and the other four to programs that performed similar tasks. Also, only four of these eight segments were key segments, either from the program they studied or from the similar versions. The order of presentation of programs with different naming styles and of segments from different structural models, category and membership was randomized. Table 2 presents the layout of an example session. Each subject performed four recognition sessions that involved 16 probe items each, so each subject had to discriminate in total 64 code segments as belonging to the studied programs or not.
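The probe counts in this procedure follow from crossing the three two-level segment properties with two probes per cell. The short sketch below, a hedged illustration in which all names are assumptions and the randomized presentation order is not modelled, enumerates one session and checks the totals quoted above.

```python
from itertools import product

STRUCTURAL_MODELS = ("Plan", "Data structure schema")
CATEGORIES = ("Key", "Non-key")
MEMBERSHIPS = ("Target", "Distractor")  # target = taken from the studied program
SEGMENTS_PER_CELL = 2
SESSION_NAMING_STYLES = ("Cryptic", "Meaningful", "Cryptic", "Meaningful")


def session_probes():
    """List the (model, category, membership) label of every probe in one session."""
    return [
        cell
        for cell in product(STRUCTURAL_MODELS, CATEGORIES, MEMBERSHIPS)
        for _ in range(SEGMENTS_PER_CELL)
    ]


probes = session_probes()
plan_probes = [p for p in probes if p[0] == "Plan"]
plan_key = [p for p in plan_probes if p[1] == "Key"]
plan_targets = [p for p in plan_probes if p[2] == "Target"]
total_probes = len(probes) * len(SESSION_NAMING_STYLES)
```

Each session therefore has 16 probes, eight per structural model, of which four are key and four come from the studied program, giving 64 probes per subject over the four sessions.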

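Additionally, after each recognition session subjects were presented with eight questions relating to functional and data structure aspects of the program they had studied. There were three reasons to include this subtask in the experimental session. First, to prevent the subjects from becoming concerned with pure memorization rather than comprehension. Second, to verify that there were differences in the quality of comprehension between the novice and experienced groups and third, because there was an implicit assumption in the formulation of the hypothesis about the type of information that naming style, among other factors, hides or makes accessible. It was assumed that one of the effects of a cryptic naming style, especially for a language which is very sensitive to this aspect, is that information about function is obscured while information about data structure is not. If this is in fact the case, the subjects, especially the experienced ones, would be able to answer questions about data structure more accurately than those related to function only for the case of programs with cryptic naming style.

TABLE 2
Recognition part of an example experimental session (one subject)

Recognition session               Structural model         Segment categorization   Segment membership†   No. of segments
Session 1 (cryptic naming)        Plan                     Key                      Target                2
Session 1 (cryptic naming)        Plan                     Key                      Distractor            2
Session 1 (cryptic naming)        Plan                     Non-key                  Target                2
Session 1 (cryptic naming)        Plan                     Non-key                  Distractor            2
Session 1 (cryptic naming)        Data structure schemas   Key                      Target                2
Session 1 (cryptic naming)        Data structure schemas   Key                      Distractor            2
Session 1 (cryptic naming)        Data structure schemas   Non-key                  Target                2
Session 1 (cryptic naming)        Data structure schemas   Non-key                  Distractor            2
Session 2 (meaningful naming)     Plan                     Key                      Target                2
Session 2 (meaningful naming)     Plan                     Key                      Distractor            2
Session 2 (meaningful naming)     Plan                     Non-key                  Target                2
Session 2 (meaningful naming)     Plan                     Non-key                  Distractor            2
Session 2 (meaningful naming)     Data structure schemas   Key                      Target                2
Session 2 (meaningful naming)     Data structure schemas   Key                      Distractor            2
Session 2 (meaningful naming)     Data structure schemas   Non-key                  Target                2
Session 2 (meaningful naming)     Data structure schemas   Non-key                  Distractor            2
Session 3 (cryptic naming)        Plan                     Key                      Target                2
Session 3 (cryptic naming)        Plan                     Key                      Distractor            2
Session 3 (cryptic naming)        Plan                     Non-key                  Target                2
Session 3 (cryptic naming)        Plan                     Non-key                  Distractor            2
Session 3 (cryptic naming)        Data structure schemas   Key                      Target                2
Session 3 (cryptic naming)        Data structure schemas   Key                      Distractor            2
Session 3 (cryptic naming)        Data structure schemas   Non-key                  Target                2
Session 3 (cryptic naming)        Data structure schemas   Non-key                  Distractor            2
Session 4 (meaningful naming)     Plan                     Key                      Target                2
Session 4 (meaningful naming)     Plan                     Key                      Distractor            2
Session 4 (meaningful naming)     Plan                     Non-key                  Target                2
Session 4 (meaningful naming)     Plan                     Non-key                  Distractor            2
Session 4 (meaningful naming)     Data structure schemas   Key                      Target                2
Session 4 (meaningful naming)     Data structure schemas   Key                      Distractor            2
Session 4 (meaningful naming)     Data structure schemas   Non-key                  Target                2
Session 4 (meaningful naming)     Data structure schemas   Non-key                  Distractor            2

† Segment membership refers to whether the segment belonged to the studied program or not.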

4.4. MATERIALS

There were four experimental programs. The first one of these transforms sublists of digits into decimal numbers and then sums them up. The second is a version of the bubble sort, the third implements a decimal to binary conversion and the fourth is a Prolog version of the "rainfall" program. All these programs are around 23 lines in length, perform a specific calculation on their own and have well-defined input and output values. As one of the interests in the experiment is to compare comprehension of programs with different naming styles, each one of these programs had two versions, one with meaningful procedure and variable names and the other with cryptic naming style for these elements. Three of these programs were basically the same as those which were used for the experiment reported in the previous section. The program shown in Figure 6 was also used for this experiment.

The four experimental programs were analysed in terms of their key segments of Plans according to the definition by Rist (1995) and in terms of their key segments of data structure schemas according to the definition by Gegg-Harrison (1991), taking as the key segments those elements that Gegg-Harrison considers as compulsory. As can be seen in the program in Figure 6, and as is indeed the case for the other experimental programs, the key segments of Plans and data structure schemas do not have elements in common. It can also be seen that while key segments of Plans normally comprise a single line, key segments of data structure schemas are usually scattered through several lines of code.

After studying the code for 3 min, subjects were presented with 16 probe segments, one at a time, to decide whether they belonged to the program they just had seen or not. Some of the code segments for the "rainfall" program are shown in Figure 12. As mentioned before, half of these segments were key and half were non-key. Non-key segments, either from Plans or from data structure schemas, meant any segments of code which were not key for their own kind. Therefore, there were probe segments that represented Plan key segments, Plan non-key segments, data structure schema key segments and data structure schema non-key segments. A simpler design would have been to consider both non-key segment types as one category. However, this was not possible because Plans and data structure schemas define key segments of different sizes and the experiment had to control for the size of probe items. This meant that probe items that represented non-key segments of one kind could include elements of key segments of the other kind. However, even if they comprised some or all of the elements of a key segment instance of the other kind, they would be classified as non-key segment probe items.

FIGURE 12. Probe segments of code for the "rainfall" program.

The segments that did not belong to the displayed programs belonged to programs that performed similar tasks. For example, the similar programs for the case of the "bubble sort" program implemented insertion and quick sorts. Also, every similar program employed the same variable and procedure names as its original program as far as possible. For example, the insertion and quick sorts used a procedure called "swap" to perform the insertion or the recursive division of the sorting list, respectively, even though neither of them needed a swap to perform their main task.

To ensure that the difficulty of the recognition task was similar for all the probe items, and therefore that any differences in the subjects' performance were due to the type of probe items rather than to other factors, the probe items were controlled for several aspects. These controls ensured that the probe items, when divided into subgroups according to the experimental conditions (key or non-key, data structure schemas or Plans and belonging or not belonging to the displayed program) had a similar recognition difficulty. The criteria used to establish this similarity were based on the serial position of the code segments in the program, on their length and, for the case of those that did not belong to the displayed program, on the degree to which they resembled a code segment of this program. The probe items were evenly distributed with respect to their serial position in the code, they had roughly the same size and the distractor items had approximately the same degree of similarity towards the code of the studied program. This degree of similarity was measured taking into account the number of elements that a distractor item had in common with the program segment that it resembled most closely. It is not suggested that this way of measuring the similarity of the distractor items has any relationship with their probability of recognition, but at least it ensures that they have roughly the same degree of similarity to the actual program segments.
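One way to read this similarity measure is as a count of shared elements between a distractor and its closest program segment. The sketch below is only an interpretation: the paper does not define what counts as an element, so the tokenization (identifiers, numbers and single symbols) is an assumption, as are the function names and the example segments.

```python
import re
from collections import Counter
from typing import List


def elements(segment: str) -> Counter:
    """Split a code segment into elements (assumed: identifiers, numbers, single symbols)."""
    return Counter(re.findall(r"[A-Za-z_][A-Za-z0-9_]*|\d+|\S", segment))


def shared_elements(a: str, b: str) -> int:
    """Number of elements two segments have in common, counting repeats."""
    return sum((elements(a) & elements(b)).values())


def distractor_similarity(distractor: str, program_segments: List[str]) -> int:
    """Shared-element count with the program segment the distractor resembles most."""
    return max((shared_elements(distractor, s) for s in program_segments), default=0)


# Two segments of the studied "rainfall" program and a hypothetical distractor.
program_segments = [
    "total_rain([],0,0,0).",
    "RainAmount is TempRainAmount+Rain,",
]
similarity = distractor_similarity("total_rain([],0,0).", program_segments)
```

Comparing this count across distractors gives a simple check that they resemble the studied program to roughly the same degree.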

4.4.1. Comprehension questions

As mentioned in Section 4.3, the accuracy rate for the comprehension questions would be important in confirming the assumption that naming style influences the kind of information that is available to programmers and therefore the characteristics of their mental representations. According to this assumption, when faced with programs with a cryptic naming style, programmers resort to basing their comprehension on data structure aspects. If this is indeed the case, the accuracy rate for questions related to function should be less than the accuracy rate of those related to data structure for the case of cryptic naming style. These rates should be similar for the case of programs with meaningful naming style.

1. Is one of the tasks of the program to obtain the number of input data items supplied by the user?

2. Does the program find out which is the input number with the maximum number of occurrences?

3. Does the main processing of the elements of the list in the total_rain/4 procedure happen in a back to front order?

4. Is the list in the total_rain/4 procedure processed until a certain element is found?

FIGURE 13. Some comprehension questions for the "rainfall" program.

There were eight yes–no comprehension questions for each program. Half of these were for functional and half for data structure aspects. Some of the comprehension questions for the case of the "rainfall" program are presented in Figure 13. The first two questions refer to the functional aspects while the last two are related to the data structure issues.

4.5. RESULTS

The results of the experiment comprise analyses for the case of accuracy and response time of both segment recognition and answering of the comprehension questions. For each one of these four cases repeated measures ANOVAs were run after verifying that their assumptions had been met. For each case, skill level acted as the between-subjects condition and naming style, kind of structure (Plans or data structure schemas) and segment category (key or non-key) as within-subjects variables.

4.5.1. Accuracy

The results of the segment recognition accuracy are summarized in Figure 14. In general, experienced users were better than novices in the recognition task, F(1,38) = 16.94, p < 0.01; and key segments were better recognized than non-key segments, F(1,38) = 44.93, p < 0.01; but there were interaction effects for these two conditions, F(1,38) = 13.81, p < 0.01. There was an absence of interaction effects between segment categorization, naming style and kind of structure (data structure schemas or Plans), which means that experts were more accurate in recognizing key rather than non-key segments regardless of naming style and of the kind of information type.

4.5.2. Response time

The results for the segment recognition response time show main effects for skill level, F(1,38) = 4.24, p < 0.05; and for segment category, F(1,38) = 7.85, p < 0.01, indicating that in general experts were slower than novices in discriminating code segments and that key segments were recognized more quickly than non-key segments. Interaction effects near to significant for these two conditions showed a tendency for the difference in key–non-key segment recognition to be larger for experienced users than for novices, F(1,38) = 2.92, p = 0.096 (see Figure 15).

FIGURE 14. Probe item recognition accuracy rate by novices and experts. Solid line, key segments; dashed line, non-key segments.

FIGURE 15. Probe item recognition response time by novices and experts. Solid line, key segments; dashed line, non-key segments.

The results of the accuracy of comprehension question answering show that experts had a better accuracy rate, F(1,38) = 56.8, p < 0.01, and that the meaningful naming style was more helpful to all programmers, F(1,38) = 7.75, p < 0.01.

The results of the response time for the question answering experimental task show that questions about function were answered faster than questions about data structure for all the experimental conditions, F(1,38) = 5.58, p < 0.05.

4.6. DISCUSSION

In general, the results of the experiment do not support the view that naming style is important in determining the kind of focal structure and therefore information type preferred by Prolog programmers. Both kinds of information, function and data structure, seem to be important for experienced Prolog programmers regardless of naming style. Experienced Prolog programmers, unlike novices, seem to make a distinction between key and non-key segments, marking the former as more relevant. This distinction is not affected by the kind of key segment considered (data structure schemas or Plans) or by the naming style of the programs they comprehend.

This result is surprising because the hypothesis contained the implicit assumption that there could only be one kind of focal element when comprehending a program. The experimental results seem to indicate that programmers might consider different kinds of elements as focal depending on the aspect of the code they are focusing on. If they are trying to understand data structure issues, one set of program segments would be relevant for them, but if they are trying to decode functional information, they might consider a different set of code segments as important.

At least for this experiment, key segments of Plans and of data structure schemas seem to be equally relevant for Prolog programmers. This seems to indicate that function as well as data structure aspects are important for this programming language and for this task. However, there might be other aspects also relevant for Prolog programmers. For example, it might also be the case that structural models related to aspects such as control-flow or data-flow are important and have their own focal structures.

The experiment reported in this section replicated Davies' (1994) results globally, but there are several differences. First, Davies only considered focal elements of Plans, and the present experiment shows that focal elements might not be restricted to Plans and therefore to functional aspects, and even more importantly, that several kinds of structures might be considered as relevant by programmers. Another difference is that in this experiment, experienced programmers were in general slower than novices in the segment recognition task, while in Davies' study experienced programmers were faster than novices. It could be said that as experienced programmers had a higher accuracy rate than novices, the response time difference might be a speed/accuracy trade-off. However, this does not explain the discrepancy of results with Davies' study.

It has to be said that the experimental settings in Davies (1994) were not the same as those of the experiment reported here. One of the most important differences is that Davies' subjects had a longer period of time to study the program (10 min as opposed to 3 min). This shorter study time was chosen because in pilot sessions, experienced programmers, and even some novices, found 10 min to be too long a time to comprehend the programs. Three minutes seemed to be just enough for them. However, it is difficult to understand how a longer exposure to the experimental programs might have made a difference in making experienced users faster than novices. Apart from this difference, the results support the finding that key segments are faster to recognize and that there is a tendency for experts to show a more pronounced difference in this recognition time.

The results of the question answering task established that there was a difference in comprehension performance between experienced and novice Prolog users. Also, the lack of interaction effects for kind of structure and naming style seems to be in agreement with the similar results (no interaction effects) for segment categorization, kind of structure and naming style for the segment recognition task. The experimental hypothesis predicted that naming style would influence the kind of information that could be detected from the program, and that therefore both the recognition and question answering tasks would have different results for different naming styles and information types. Just as the difference between key and non-key segments remained constant for both of these conditions, the question answering accuracy did not vary depending on either naming style or kind of structure. There were main effects for naming style, which indicate that cryptic programs are more difficult to understand than meaningful code, but it seems valid to assume that in both cases programmers find both kinds of key segments relevant.

5. General discussion

Both experiments support the notion of knowledge restructuring proposed by Davies (1993, 1994), which states that experienced programmers restructure their structural knowledge to allow some elements of it to become relevant or salient. However, the results of these experiments indicate that this knowledge restructuring is not exclusive to Plans and functional information. In the case of Prolog, programmers also seem to restructure their programming knowledge according to at least one other type of programming information (data structure). It also seems that this restructuring of several aspects of programming knowledge might produce several kinds of key segments in the code.

The experiments reported here show that information about data structure is important for Prolog programmers and that they restructure this kind of programming knowledge as a result of experience. However, one point that needs to be clarified is that the structural model related to data structure employed in this experiment (Prolog Schemas in Gegg-Harrison's terms) has been defined only in terms of lists, one of the most important data structures in Prolog. It might be the case that only this specific type of data structure is important for Prolog programmers. However, it seems that other types of data structures are important as well. For example, knowledge about how to update a numeric variable seems to be very important, and specific, to Prolog. Novices are often surprised when they learn that in general a variable cannot change its value in this programming language and that, in order to achieve the effect of an update, another variable has to be used.
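The point about updating a numeric variable can be illustrated with a minimal sketch. The predicate name sum_list_acc and its accumulator arguments are hypothetical and are not taken from the experimental materials; the sketch only assumes standard Prolog arithmetic with is/2.

```prolog
% Hypothetical illustration, not one of the experimental programs.
% sum_list_acc(+List, -Sum): Sum is the total of the numbers in List.
sum_list_acc(List, Sum) :-
    sum_list_acc(List, 0, Sum).

% Base case: the list is exhausted, so the accumulator holds the result.
sum_list_acc([], Acc, Acc).
% Recursive case: Acc cannot be overwritten, so a fresh variable Acc1
% carries the updated running total into the recursive call.
sum_list_acc([X|Xs], Acc, Sum) :-
    Acc1 is Acc + X,
    sum_list_acc(Xs, Acc1, Sum).
```

Under these assumptions, the query sum_list_acc([1, 2, 3], Sum) binds Sum to 6, whereas a goal such as N = 1, N is N + 1 simply fails, because N is already bound to 1 and cannot be made equal to 2.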

The finding that there might be more than one kind of focal element in a program might seem counter-intuitive at first. It would seem that only one aspect should be relevant to programmers. However, the results of these experiments seem to support the idea that comprehension involves the detection of several aspects of the program, and each of these aspects might have its own organization and hierarchical relations. Which aspects are relevant and to what degree might depend on the programming language and the programmer's experience, among other factors. For Prolog programmers, it seems that function and data structure are two important aspects, although there might be more than just these two relevant information types. More experimentation is needed to find out whether there are other kinds of focal elements for Prolog. Also, it would be worthwhile performing a similar investigation for procedural languages. Although the evidence suggests that focal elements in these languages are related to function, the studies that investigated this issue assumed that there were no other types of focal elements.

One issue that remains to be explained is the reason for the discrepancy in the experimental results concerning Plan focal structures. In the first experiment, the only relevant key segments were those of data structure schemas, while in the second experiment both key segments of data structure schemas and of Plans seemed to be relevant. Both experiments had similar settings and the programmer subjects had similar experience and background. One of the differences between these two experiments was the nature of the experimental task. The first experiment used a program recall task while the second applied a program recognition task. As mentioned in Section 3.5, according to Kintsch (1998), one important difference between these two memory tasks is that recall, unlike recognition, is sensitive to material that can be associated and organized clearly.

Key segments of Plans, unlike those of data structure schemas, do not seem to have a well-defined internal organization because they are dependent on other parts of the program. If these other parts of the program could not be recalled successfully, it would be harder for key segments of Plans to be recalled accurately. In the second experiment, on the other hand, Plan key segments could be recognized even when other parts of the program necessary for their complete understanding were not present. So the disparity of results could be explained by a combination of the experimental task and the hidden dependencies [in terms of Green's (1999) cognitive dimensions] of Prolog. This combination of task and Prolog characteristics might also be playing a role in the lack of success of other studies that have tried to find evidence of the importance of functional information for this language. Studies of Prolog code generation by Bellamy and Gilmore (1990) and Ormerod and Ball (1993) did not find any evidence of functional key segment-oriented strategies similar to those reported by Rist (1989) and Davies (1991) for Pascal programmers. Rist and Davies showed that focal elements of Plans played an important role in the way Pascal programmers write code. It might be the case that the strong dependencies of functional key segments in Prolog obscured their importance for code generation.

The global results of the study indicate that data structure schemas are a plausible model of structural knowledge for Prolog, but functional information, and Plans, are important as well for Prolog programmers. This result does not rule out the possibility that there might be further aspects of the code that are important for this programming language.

6. Conclusions

This paper has explored which programming aspect seems to be most relevant for Prolog programmers, adopting the model of structural knowledge organization proposed by Davies (1993, 1994), but extending the notion of focal elements to consider not only function but also other programming information types.

The study comprised two experiments. The first one was of an exploratory nature and tried to find out which one of four programming knowledge structures seems to be most important for Prolog programmers during program comprehension. It seems that information related to data structures was important in this case.

The second experiment tried to replicate the results of the first one, but applied a recognition instead of a recall task, taking into account only two kinds of information types, data structure schemas and Plans. The results of this second experiment indicate that, at least for the case of Prolog, experienced programmers restructure the organization of both their Plan and schema knowledge, making certain elements of these structures become more relevant than others.

The reason for Plan information being apparently not so important in the first experiment seems to be the difference between the two experimental tasks used in this study, recall and recognition. It has been argued that recall facilitated high rates of accuracy for data structure schemas but not for Plans. If the experimental task is indeed of prime importance to explain these experimental results, then more experimentation with a variety of tasks is needed to confirm them. Considering different tasks is of particular importance when looking at the relevance of focal elements in program comprehension because the studies that have investigated this relationship considered memorization-like tasks only. Although the results of these experiments are encouraging, it is not known if knowledge about focal elements is important for practical programming tasks such as debugging or reuse. An experiment in progress is investigating this issue for the debugging task. According to Davies, errors located in the program's key segments will be easier to spot than errors elsewhere. Although this prediction is expected to be true for Prolog, it is not known which type of focal elements, functional or data structure, will be relevant for the debugging task. It is also possible that some other structural models and information types besides Plans and data structure schemas might be important for Prolog programmers when debugging.

Another important issue concerns the generality of the results in terms of the full range of data structure types in Prolog. The only data structure considered in the experiments is the list. Although lists are one of the most important data structures for this language, further experimentation should consider other data structure types.

Several studies have suggested that experienced programmers construct several views of the program, according to the different types of information implicit in the program text (Pennington, 1987; Davies et al., 1995; Wiedenbeck & Ramalingam, 1999). The experiments reported here suggest that their structural knowledge for each of these aspects of the code shows an organization in which specific elements are more relevant than others. This would imply that the notion of focal elements involves not only function but other types of programming information.

The author would like to express his thanks to Ben du Boulay of COGS, Sussex University, for his help in the preparation of this paper, to Thomas Green of CBL, Leeds University, for his advice in the design of the first experiment and to Judith Good of HCRC, Edinburgh University, and Chris Taylor of COGS, Sussex University, for their help in contacting experienced Prolog users. Thanks also to all the subjects for their time and patience.

References

BELLAMY, R. K. E. & GILMORE, D. J. (1990). Programming plans: internal and external structures. In K. GILHOOLY, M. T. G. KEANE, R. H. LOGIE & G. ERDOS, Eds. Lines of Thinking: Reflections on the Psychology of Thought, Vol. 1, pp. 59–71. London, UK: Wiley.

BERGANTZ, D. & HASSELL, J. (1991). Information relationships in PROLOG programs: how do programmers comprehend functionality? International Journal of Man–Machine Studies, 35, 313–328.

BOWLES, A. & BRNA, P. (1993). Programming Plans and programming techniques. In P. BRNA, S. OHLSSON & H. PAIN, Eds. World Conference on Artificial Intelligence in Education, pp. 378–385. Edinburgh, UK: Association for the Advancement of Computing in Education.

BRNA, P., BUNDY, A., DODD, T., EISENSTADT, M., LOOI, C. K. & PAIN, H. (1991). Prolog programming techniques. Instructional Science, 20, 111–133.

BROOKS, R. (1993). Towards a theory of the comprehension of computer programs. International Journal of Man–Machine Studies, 18, 543–554.

DAVIES, S. P. (1990). The nature and development of programming plans. International Journal of Man–Machine Studies, 32, 461–481.

DAVIES, S. P. (1991). Characterising the program design activity: neither strictly top-down nor globally opportunistic. Behaviour and Information Technology, 10, 173–190.

DAVIES, S. P. (1993). Models and theories of programming strategy. International Journal of Man–Machine Studies, 39, 237–267.

DAVIES, S. P. (1994). Knowledge restructuring and the acquisition of programming expertise. International Journal of Human–Computer Studies, 40, 703–726.

DAVIES, S. P., GILMORE, D. J. & GREEN, T. R. G. (1995). Are objects that important? Effects of expertise and familiarity on classification of object-oriented code. Human–Computer Interaction, 10, 227–248.

DÉTIENNE, F. (1990). Expert programming knowledge: a schema-based approach. In J. HOC, T. R. G. GREEN, R. SAMURCAY & D. J. GILMORE, Eds. Psychology of Programming, pp. 205–222. London, UK: Academic Press.

GEGG-HARRISON, T. S. (1991). Learning Prolog in a schema-based environment. Instructional Science, 20, 173–192.

GILMORE, D. J. & GREEN, T. R. G. (1988). Programming Plans and programming expertise. Quarterly Journal of Experimental Psychology, 40A, 423–442.

GREEN, T. R. G. (1999). Building and manipulating complex information structures: issues in Prolog programming. In P. BRNA, B. DU BOULAY & H. PAIN, Eds. Learning to Build and Comprehend Complex Information Structures: Prolog as a Case Study, pp. 7–28. Stamford, CT: Ablex.

GREEN, T. R. G., BELLAMY, R. K. E. & PARKER, J. M. (1987). Parsing and Gnisrap: a model of device use. In G. M. OLSON, S. SHEPPARD & E. SOLOWAY, Eds. Empirical Studies of Programmers, Second Workshop, pp. 132–146. Norwood, NJ: Ablex.

KINTSCH, W. (1998). Comprehension: a Paradigm for Cognition. Cambridge, UK: Cambridge University Press.

LE, T. V. (1993). Techniques of Prolog Programming with Implementation of Logical Negation and Quantified Goals. New York: John Wiley & Sons.

ORMEROD, T. C. & BALL, L. J. (1993). Does design strategy or programming knowledge determine shift of focus in expert Prolog programming? In Empirical Studies of Programmers, Fifth Workshop, pp. 162–186. Norwood, NJ: Ablex.

PENNINGTON, N. (1987). Stimulus structures and mental representations in expert comprehension of computer programs. Cognitive Psychology, 19, 295–341.

RIST, R. S. (1989). Schema creation in programming. Cognitive Science, 13, 389–414.

RIST, R. S. (1995). Program structure and design. Cognitive Science, 19, 507–562.

SCHANK, R. C. & ABELSON, R. (1977). Plans, Goals and Understanding. Hillsdale, NJ: Erlbaum.

WIEDENBECK, S. (1986). Processes in computer program comprehension. In E. SOLOWAY & S. IYENGAR, Eds. Empirical Studies of Programmers, First Workshop, pp. 48–57. Norwood, NJ: Ablex.

WIEDENBECK, S. & RAMALINGAM, V. (1999). Novice comprehension of small programs written in the procedural and object-oriented styles. International Journal of Human–Computer Studies, 51, 71–87.