6
Scientific Programming 20 (2012) 349–353 349 DOI 10.3233/SPR-2012-0348 IOS Press Book Review Dan Nagle E-mail: [email protected] Scientific Software Design: The Object-Oriented Way, by Damian Rouson, Jim Xia and Xiaofeng Xu (authors), Cambridge University Press, 2011, ISBN 978-0-521-88813-4, 404 pp., hardcover. 1. Who are we, anyway? What is computational science? We have all heard the cheerleading. Computational science is the third great leg of science, to take its rightful place alongside observation and experiment, on the one hand, and the- ory, on the other. But what has computational science really got that makes it a third way? Is it not just ap- plied theory, computing the consequences of an equa- tion? Is it not simply mining observations or control- ling experiments and recording the results? Is there really a body of knowledge, distinct from any field of application, that stands alone? Describing such a body of knowledge is where the book described in the fol- lowing stands. 2. A body of craft Scientific Software Design, subtitled The Object- Oriented Way, is 382 pages and includes a Preface, 12 chapters (in three sections), two appendices, a Bib- liography and an Index. The authors are from Sandia National Laboratories, IBM Canada Labs, and General Motors, respectively. As the title indicates, this book is about software design, particularly the design of scien- tific software. The examples involve solving differen- tial equations (both ODEs and PDEs). Computational Science departments, where they exist as separate en- tities, or where they are lumped together with other sciences in a department of this, that, and computa- tional science, must work to explain the differences be- tween themselves and the established Computer Sci- ence department (whether lumped together with this or that engineering). Perhaps this contributes to an em- phasis on managing large datasets (for example, the output of a large telescope or particle collider), or solv- ing differential equations on complex grids (for exam- ple, fluid flow in ever-more realistic settings), rather than on old fashioned software engineering. These au- thors set themselves the task of addressing this per- ceived gap. We sally into the Preface. With the laws of physics in our current universe being what they are, high per- formance computing means highly-scalable comput- ing. These authors mean to lead us on a track which, they claim, is nearly orthogonal. They want to dis- cuss highly-scalable design. What is that? With more and more science being multi-science, advances are happening along the boundaries among various disci- plines. So several experts are involved. As the code grows, it will become more complex, with no one au- thor able to ride herd on all of it. One consequence is that the design must support clearly understood inter- faces so interoperability will be effective in the long haul. The social side of this situation is that every ex- pert in one topic is a non-expert in other topics, who nevertheless must be able to understand the overall computational scheme, even if only to integrate each expert’s own contribution into the mix. And with sci- ence being done in the interdisciplinary gaps, mix it will be. We also understand that greater mix means computationally greater size and complexity. Not only must the scientists work together effectively, so must their code contributions. The contributions of the var- ious disciplines are being combined, not displaced. From this perspective, we approach Part 1, The Tao of Scientific OOP. Chapter 1, Development Costs and Complexity, starts us with the notions of multiscale and multi- physics. The authors deplore their perceived lack of discussion of software architecture in today’s compu- tational science education. They illustrate with an ex- ample. Start with the heat equation, a model of physi- cal reality. From here, one derives a oft-stated problem of the heat flow along a conductor, in this example case the heat sink on a processor chip. Next, a discretiza- tion scheme is applied to produce equations suitable 1058-9244/12/$27.50 © 2012 – IOS Press and the authors. All rights reserved

Scientific Programming 20 (2012) 349–353 349 DOI 10.3233 ...downloads.hindawi.com/journals/sp/2012/639095.pdf · operators or Fortran user-defined assignment, we can make our

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Scientific Programming 20 (2012) 349–353 349 DOI 10.3233 ...downloads.hindawi.com/journals/sp/2012/639095.pdf · operators or Fortran user-defined assignment, we can make our

Scientific Programming 20 (2012) 349–353 349DOI 10.3233/SPR-2012-0348IOS Press

Book Review

Dan NagleE-mail: [email protected]

Scientific Software Design: The Object-OrientedWay, by Damian Rouson, Jim Xia and Xiaofeng Xu(authors), Cambridge University Press, 2011, ISBN978-0-521-88813-4, 404 pp., hardcover.

1. Who are we, anyway?

What is computational science? We have all heardthe cheerleading. Computational science is the thirdgreat leg of science, to take its rightful place alongsideobservation and experiment, on the one hand, and the-ory, on the other. But what has computational sciencereally got that makes it a third way? Is it not just ap-plied theory, computing the consequences of an equa-tion? Is it not simply mining observations or control-ling experiments and recording the results? Is therereally a body of knowledge, distinct from any field ofapplication, that stands alone? Describing such a bodyof knowledge is where the book described in the fol-lowing stands.

2. A body of craft

Scientific Software Design, subtitled The Object-Oriented Way, is 382 pages and includes a Preface,12 chapters (in three sections), two appendices, a Bib-liography and an Index. The authors are from SandiaNational Laboratories, IBM Canada Labs, and GeneralMotors, respectively. As the title indicates, this book isabout software design, particularly the design of scien-tific software. The examples involve solving differen-tial equations (both ODEs and PDEs). ComputationalScience departments, where they exist as separate en-tities, or where they are lumped together with othersciences in a department of this, that, and computa-tional science, must work to explain the differences be-tween themselves and the established Computer Sci-ence department (whether lumped together with this orthat engineering). Perhaps this contributes to an em-phasis on managing large datasets (for example, the

output of a large telescope or particle collider), or solv-ing differential equations on complex grids (for exam-ple, fluid flow in ever-more realistic settings), ratherthan on old fashioned software engineering. These au-thors set themselves the task of addressing this per-ceived gap.

We sally into the Preface. With the laws of physicsin our current universe being what they are, high per-formance computing means highly-scalable comput-ing. These authors mean to lead us on a track which,they claim, is nearly orthogonal. They want to dis-cuss highly-scalable design. What is that? With moreand more science being multi-science, advances arehappening along the boundaries among various disci-plines. So several experts are involved. As the codegrows, it will become more complex, with no one au-thor able to ride herd on all of it. One consequence isthat the design must support clearly understood inter-faces so interoperability will be effective in the longhaul. The social side of this situation is that every ex-pert in one topic is a non-expert in other topics, whonevertheless must be able to understand the overallcomputational scheme, even if only to integrate eachexpert’s own contribution into the mix. And with sci-ence being done in the interdisciplinary gaps, mix itwill be. We also understand that greater mix meanscomputationally greater size and complexity. Not onlymust the scientists work together effectively, so musttheir code contributions. The contributions of the var-ious disciplines are being combined, not displaced.From this perspective, we approach Part 1, The Tao ofScientific OOP.

Chapter 1, Development Costs and Complexity,starts us with the notions of multiscale and multi-physics. The authors deplore their perceived lack ofdiscussion of software architecture in today’s compu-tational science education. They illustrate with an ex-ample. Start with the heat equation, a model of physi-cal reality. From here, one derives a oft-stated problemof the heat flow along a conductor, in this example casethe heat sink on a processor chip. Next, a discretiza-tion scheme is applied to produce equations suitable

1058-9244/12/$27.50 © 2012 – IOS Press and the authors. All rights reserved

Page 2: Scientific Programming 20 (2012) 349–353 349 DOI 10.3233 ...downloads.hindawi.com/journals/sp/2012/639095.pdf · operators or Fortran user-defined assignment, we can make our

350 D. Nagle / Book Review

for digital computation. And code may be developed tobe executed on a digital computer to solve the originalproblem. Any student in a computational science pro-gram could go this far, perhaps as a homework prob-lem. But, the authors remind us, the programmer’s timeis more valuable than the computer’s time. A simpleexample with a postdoc and a departmental cluster putsrealistic numbers on the money involved to prove thepoint. So the discussion takes us to code complexityto show us that, while the numerical complexity of thesolution scheme can be estimated, the complexity ofthe code maintenance and code development, all toooften, is not considered. In short, how many lines ofcode must be checked to find a bug? This is the mainexpense of fixing the bug. A simple solution of our heatproblem provides the example. Language constructs,such as modules, pure functions, and argument intents,are helpful by limiting the lines of code that must bechecked for any search. Here, the authors’ penchantfor mixing metaphors comes to the fore, Ravioli TastesBetter Than Spaghetti Code. Momentarily, we divert toa history of computer science. We start with assemblycode, with its unbounded jumps. Then, fifty years ago,Fortran put some bounds on things, with defined loopsand choice blocks (if-blocks did not arrive until For-tran 77, but that is a nitpick – loops were there from thebeginning and anyway the argument is sound). Veeringpast whether jumps should be considered harmful, wesightsee along the structured programming vista beforearriving at object-oriented programming. While keep-ing code to small code segments has not been shownto reduce the overall count of bugs, it does neverthe-less reduce the amount of code needing to be checkedwhen the symptoms of a bug are seen. The 90–10%rule, which the authors call the Pareto Principle andquote as 80–20%, is next. Thus, we may write mostof our code to be readable by ourselves, as the bottle-neck is a small portion of the overall software. And thekernel is likely supported by a tuned library procedure,so we will not ourselves write even all of the kernel.And writing modular, object-oriented code will helpus parallelize our code when time comes to care aboutperformance. The authors finish this first chapter witha presentation of their coding style, and show us thatFortran and C++, as done here anyway, appear rathersimilar.

So oriented, we venture into Chapter 2, The ObjectOriented Way. We start with a Rosetta Stone, wherenames of concepts taken from Unified Modeling Lan-guage (UML) are related to their near-synonyms inC++ and Fortran. So the UML Abstract Data Type

(ADT) is related to the C++ class and the Fortran de-rived type. It is a long list. Feeling bound to begin with“Hello, world!” we find a structured program of fivelines that does the trick. It appears on the same pagean object-oriented program of 33 lines to the same ef-fect. What was done? Or rather, why? We have builta solid wall defending the data of each step of estab-lishing data, actor and action. With this, modificationsmay be done with greater confidence. And the countof lines to check to find bugs can be substantially re-duced. One could apply the techniques employed towrap some old, yet trusted code to be used in a newprogram, without fear of contamination across the bor-der. We take our object-oriented analysis, fortified withUML diagrams, and create an object-oriented designfor our heat problem. We visit composition, aggrega-tion and inheritance. Some advantages of inheritanceare shown when we wish to modify our heat programto use more general heat distributions. That was easy!Which is the point. We encounter unit testing, a ben-efit of our data-isolating style. A short discussion ofstatic inheritance versus dynamic inheritance brings usto a revisiting of the complexity analysis we saw ear-lier. And all this was done in Fortran, with public andprivate attributes, only clauses on use statements, andpure functions sporting intent-in arguments. The inter-module barrier holds, and can be seen to hold.

Of course, a scientific software project will grow,and this is what happens in Chapter 3, Scientific OOP.(Scientific OOP will be written SOOP, pronounced,I presume, soup.) We meet Abstract Data Type Calcu-lus. Using C++ overloaded operators or Fortran user-defined operators, and C++ overloaded assignmentoperators or Fortran user-defined assignment, we canmake our software resemble our blackboard notes. Weexplore forward Euler quadrature followed by a back-ward Euler quadrature. Once we have methods to com-pute a time derivative, either quadrature can be codedquickly, along with the confidence-giving test caseswe now know we want. Object Oriented Analysis andObject Oriented Design are working. More integrationschemes are easily added to our heat code. How easy isit? Well, how hard is it? And we are lead to design met-rics. We meet the concepts of cohesion and coupling,followed by the notions of an ADT’s afferent couplingsand its efferent couplings. Hence, we visit complexitytheory. Exact results are not available, but we make dowith computed bounds. Next, we visit information the-ory, and ask how much must be known for two classesto interact? That is, how much must two programmerscommunicate to work together on the same program?All of which leads us to the Tao of SOOP.

Page 3: Scientific Programming 20 (2012) 349–353 349 DOI 10.3233 ...downloads.hindawi.com/journals/sp/2012/639095.pdf · operators or Fortran user-defined assignment, we can make our

D. Nagle / Book Review 351

Having mastered the basics, our journey continuesinto Part II, SOOP to Nuts and Bolts (yes, by nowone has learned not to mind the mixed metaphors).So we find ourselves starting Chapter 4, Design Pat-tern Basics. Design patterns, we learn, originated inarchitecture. We take the lessons learned there toprogramming. We have a bit of a history lesson of howarchitectural ideas came into computer science. So fol-lowing the path of architect Chris Alexander in his APattern Language books, and determined to use designpatterns, we encounter the three problems that serve asexamples in the remainder of Part II, the Lorenz Equa-tions, whose chaos allows us to test our computed solu-tions, quantum vortices in superfluids described by theBiot–Savart law and Burgers equation. So prepared, weventure onwards.

The next chapter, Chapter 5, The Object Pattern, dis-cusses the general object. It starts with a discussionof just how rigid to make the design of a large pro-gram. Which details to specify? If one is too lax, thecode will lose understandability; if too tight, one mightpress needless constraints, hindering some efforts andwasting others. The object pattern takes its name fromJava, where all entities are descendants of the objectclass. The object class holds codifications of propertieswe want all objects to have, such as construction anddestruction. Our example for this chapter is the vortexproblem described in the previous chapter. It is inter-esting due to the desire to have an object represent eachvortex, which must be linked together to form the ring.Since the vortexes occur in a ring, and one would likea pointer from vortex to vortex, one has an interestingissue: how to add or remove vortexes? The data struc-ture is not a list yet has a referent from the outside.Once one’s code is traversing the ring, one has end-less traversal. (Visualize a figure 6. Descending fromthe top along the riser, one is following an outside ref-erent. Once in the loop at the bottom, one continuesin endless cycles.) Pointers are used in either Fortranor C++, but the usual object constructor or destruc-tor may not be used. Thus, one has a dangling pointerproblem (where the object targeted by the pointer is nolonger valid, yet the pointer remains). How to avoid thememory leak? The object pattern is used to supplementthe language-supplied pointer to provide safe pointermanipulation. In either language, a more elegant solu-tion could be employed, but the authors sacrifice forpedagogic purposes, keeping the solution simpler, andtheir point is clearly made.

Moving to Chapter 6, The Abstract Calculus Pat-tern, we see how to build our divisions between mod-

ules more clearly. The Lorenz problem, those three lit-tle scalar coupled differential equations demonstratingthe attractor, will be our focus. This time, we write ab-stract classes to supply the integration services. Ouruser will write to the abstract classes. We then imple-ment the abstract class with a Lorenz class containingthe particulars. Our user does not even see our actualprocedures, only their signatures. The solutions in ei-ther C++ or Fortran are strikingly similar. The biggestdifference being that Fortran uses a default clause ina select type block where C++ attempts a dynamiccast within a try block, both languages guard to catcha class that cannot be integrated.

In Chapter 6, we supplied a service, integratingthe Lorenz equations, without exposing the proceduresthat did so, we only exposed their signatures. Thus, wecan change the procedure we supply during execution.That is what we do in Chapter 7, The Strategy and Sur-rogate Patterns. Now, we integrate the Lorenz equa-tions, and we can choose an integrator from among aselection provided. As the design becomes more ab-stract, the C++ and Fortran expositions of the designare, surprisingly perhaps, converging. Here, the majordifference is that the Fortran version must use an auxil-iary class to implement the Surrogate Pattern, which isunneeded in C++ due to its forward attribute. On theother hand, the C++ version uses reference-countedpointers where Fortran can gain the same effect withits allocatable variables.

Pressing bravely into Chapter 8, The Puppeteer Pat-tern, we encounter the next level of complexity. Re-call that we are addressing the complexity that arisesfrom multi-physics simulations. Our weapon of choiceis to safely encapsulate local data within the definitionsof the software representing the abstractions involved.What happens when a non-linear problem requires datafrom within each abstraction to use an implicit methodto advance to the next time step? We employ a Pup-peteer, which acts the role of the broker who knows allthe actors (considered to be the puppets), and allowsthem to cooperate without knowing each other. Again,the C++ implementation and the Fortran implementa-tion are remarkably similar.

Chapter 9, Factory Patterns, takes the encapsula-tion process one step further. In the non-linear problemsolved in Chapter 8, we had to allow objects of oneclass, the puppeteer class, to have access to data that wewould like to be private within another class (the pup-pets). Can this be avoided? Indeed it can, and we learnthat the factory pattern shows the way. Our example issolving Burgers equation. This time, all the classes in-

Page 4: Scientific Programming 20 (2012) 349–353 349 DOI 10.3233 ...downloads.hindawi.com/journals/sp/2012/639095.pdf · operators or Fortran user-defined assignment, we can make our

352 D. Nagle / Book Review

volved will be unknown to each other except via theirparent classes. The parent classes expose the signaturesof the procedures to be publicly known. The actual pro-cedures are not known outside the child classes at all.Instances of a factory class can create new instancesof the working classes as needed. Our separation tech-niques are functioning well. And we can advance toPart III, Gumbo SOOP.

Chapter 10, Formal Constraints introduces us to theObject Constraint Language (OCL). This language isaimed towards bridging the gap between working sci-entists, who are the subject matter experts of scien-tific computing, and experts in formal methods, thelanguage whereby program correctness is to be ex-pressed. Our example is a solution of Burgers equationexpressed in Fortran. OCL lacks a good expression ofarrays, so we must explore how to do so before pro-ceeding to discuss memory management. Fortran spec-ifies that the arguments of defined operators must notbe changed by the pure procedure defining the opera-tion, so if the results of nested operators are pointers,memory leaks can develop especially in complicatedexpressions with many partial results. This point, andFortran’s lack of a language-defined assertion mecha-nism, are the subject here. Solutions are provided, butthe language could provide more help to good effect.

So we move into Chapter 11, Mixed-Language Pro-gramming. The context for the discussion is the Trili-nos package. The goal is to link the Fortran bindingwith the C++ software. The strategy is to use Fortran’sInteroperability with C to define the binding. Trilinosuses an object oriented design, so linking the of C++to C to Fortran poses the issue that C does not directlysupport object oriented designs. Nevertheless, the au-thors describe a solution, which allows both the C++side and the Fortran side to use object oriented featureswhile communicating through C.

And so we arrive at the ultimate chapter, Chapter 12,Multiphysics Architectures. This brings us to the pointwhere multiphysics meets multiprocessor. We profileour Burgers solver. Next, we dutifully salute Amdahl’sLaw one more time, this time with the greater detailof a multipart program. We review multithreading onlyto find refuge in directive-based OpenMP. Our next ef-fort lands us in the war zone of message-passing, com-plete with conditional compilation, via cpp, to removethe library calls when undesired. We finally come tomeet coarrays. Unlike library code, coarrays are fullyintegrated into the language, so they fit into derivedtypes without jumping through hoops and fully partici-pate in other common language features (allocation, in-

put/output and so on). Our goal is to compute solutionsto scientific problems, so we meet the multiphysics.Quantum turbulence, lattice-Boltzmann biofluid dy-namics, particle dispersion in magnetohydrodynamics,and radar scattering in the atmospheric boundary layerserve as example problems. Each requires cooperationfrom a variety of disciplines. We arrive at the MorfeusFramework, an open-source project intended to aid ourefforts at solving these and similarly complex calcula-tions.

We have reached the Appendices. The first coversmathematics. The material would be at home in a first-year graduate course in Computational Science. Thesecond introduces the reader to the Unified ModelingLanguage (UML). The authors make good use of theUML, the reader lacking familiarity with it could prof-itably read it before starting the chapters. Other readersmight prefer to find a book on UML before venturinginto the main text.

3. The skills of growth

So where have we been on this journey? The au-thors start us with the proposition that binding soft-ware into independent, self-contained and thereforereusable, segments is good for software development.Many professional programmers, making, for exam-ple, office software, would do little more than yawn inreaction. The authors have then proceeded to show usthat, as the scientific complexity of our computationalefforts grows, so the need for independence of scien-tific modules grows as well. This is due to the degreeof detail to be included in our computations within onediscipline, but also due to the interdisciplinary needs ofa single analysis. The authors have also shown that, farfrom being at odds with the needs for parallel process-ing implied by the magnitude of operations counts andthe size of datasets, properly handled independence ofmodules works synergistically with the forms of com-putation needed by parallel processors. This is indeedgood and useful news. The techniques shown here canindeed form the inspiration for one’s own design (orredesign). Indeed, there are several instances abovewhere I wrote of what the programmer could or shoulddo, when I could (should?) have written of the scien-tist instead. And that’s the point: if computational sci-ence is a distinct discipline, the scientist must becomea programmer.

New hardware today moves from commercial or en-tertainment uses towards scientific use. The “attack of

Page 5: Scientific Programming 20 (2012) 349–353 349 DOI 10.3233 ...downloads.hindawi.com/journals/sp/2012/639095.pdf · operators or Fortran user-defined assignment, we can make our

D. Nagle / Book Review 353

the killer micros” was first heard some 25 years ago.And today, 64-bit parallel hardware is developed forgamers and home viewing of movies. That’s wherethe big money is, after all. Are software developmenttechniques moving from commercial software devel-opers to scientific software developers, too? The an-swer must be “yes” but certainly it’s happening moreslowly than one might like. Confidence in long-usedand well-tested software, and a desire for continuity ofpublished work conspire to keep the “if it ain’t broke,don’t fix it!” attitude alive and well. ComputationalScience is new to many campuses, so practitioners arestill resolving just what the departmental identity is.Surely the focus must be on differential equations andlarge databases, with perhaps some graphics for un-derstanding and for communication to colleagues. So

perhaps issues of software engineering are still in thepending tray.

These authors have shown that increasing complex-ity, due to increased details of the calculation, in-creased variety of contributors to bring together anincreased variety of expertise, and the increasinglycomplex demands of parallel processing for execution,must be addressed, and they have gone a long wayat least towards showing us how to do so. Networkedopen source scientific software projects may furtherconfuse things by leaving some inter-programmercommunications in batch mode. Scientific softwaremust be consciously designed to grow with a researchprogram and the hardware that supports the researchprogram. And how to do that is precisely what theseauthors in this book have shown.

Page 6: Scientific Programming 20 (2012) 349–353 349 DOI 10.3233 ...downloads.hindawi.com/journals/sp/2012/639095.pdf · operators or Fortran user-defined assignment, we can make our

Submit your manuscripts athttp://www.hindawi.com

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttp://www.hindawi.com Volume 2014

Hindawi Publishing Corporationhttp://www.hindawi.com Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttp://www.hindawi.com

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation http://www.hindawi.com Volume 2014

Hindawi Publishing Corporationhttp://www.hindawi.com Volume 2014

Applied Computational Intelligence and Soft Computing

 Advances in 

Artificial Intelligence

Hindawi Publishing Corporationhttp://www.hindawi.com Volume 2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttp://www.hindawi.com Volume 2014

Hindawi Publishing Corporationhttp://www.hindawi.com Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttp://www.hindawi.com Volume 2014

Hindawi Publishing Corporation

http://www.hindawi.com Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttp://www.hindawi.com Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttp://www.hindawi.com Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttp://www.hindawi.com Volume 2014

Hindawi Publishing Corporationhttp://www.hindawi.com Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttp://www.hindawi.com Volume 2014

Modelling & Simulation in EngineeringHindawi Publishing Corporation http://www.hindawi.com Volume 2014

The Scientific World JournalHindawi Publishing Corporation http://www.hindawi.com Volume 2014

Hindawi Publishing Corporationhttp://www.hindawi.com Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttp://www.hindawi.com Volume 2014