
ICS Newsletter

Contents

 1  A Science in Trouble
 3  Message from the Editors
 3  What about parallel programming and high-end systems?
 8  ICS Member Profile: Richard E. Rosenthal
17  ITORMS: Call for Papers
17  GeNIe 1.0
18  Message from the Chair
19  New Books
21  Journal on Computing: Vol. 10, No. 4
21  Fred Glover Wins von Neumann Prize
22  Seventh INFORMS Computing Society Conference
23  News about Members
23  Upcoming Meetings

Volume 19, Number 2 Fall 1998

Computing Society

A Science in Trouble

Asim Roy

Arizona State University

Can a whole body of science simply unravel when confronted by a few simple challenges? Can a large body of scientists overlook some very simple facts for a long period of time? From the current debate on how the brain learns, the answer appears to be "yes" for at least one body of science – artificial neural networks or connectionism. A sampling of some recent comments might be a good indicator of this. The first open and public admission that much of the existing science is wrong came from Christoph von der Malsburg, a German neurobiologist and computer scientist affiliated with both the Ruhr University of Germany and the University of Southern California. In commenting on the challenge I posed, which claimed that neural networks do not embody brain-like learning, he remarked, "I strongly believe that the current paradigm of neural network learning misses very essential aspects of the learning problem, and I totally concur with the assessment, expressed in your exposé, that specific prior knowledge is required as a basis for learning from a given domain... I am glad the issue seems to start picking up momentum. Some Kuhnian revolution is required here, and as he (T. Kuhn) wrote, such scientific revolutions are always

Continued on page 10


The Institute for Operations Research and the Management Sciences' Computing Society Newsletter

Volume 19, Number 2, Fall 1998

Copyright © 1998 by the Institute for Operations Research and the Management Sciences (INFORMS). Abstracting and nonprofit use of this material is permitted with credit to the source. Libraries are permitted to photocopy beyond the limits of United States copyright law for the private use of their patrons. Instructors are permitted to photocopy isolated articles for noncommercial classroom use without fee.

The ICS Newsletter is published semiannually on April 1st and October 1st by the INFORMS Computing Society (ICS). Manuscripts, news articles, camera-ready advertisements and correspondence should be addressed to the Editor. Manuscripts submitted for publication will be reviewed and should be received by the Editor three months prior to the publication date. Requests for ICS membership information, orders for back issues of this Newsletter, and address changes should be addressed to:

INFORMS Computing Society Business Office
901 Elkridge Landing Road, Suite 400
Linthicum, MD 21090-2909
phone: (410) 850-0300

ICS Officers

Chair
Harlan P. Crowder ([email protected])
ILOG, Inc.
Silicon Valley, CA

Chair-Elect
Ramayya Krishnan ([email protected])
Carnegie Mellon University
Heinz School of Public Policy & Mgmt
Pittsburgh, PA 15213
(412) 268-2174

Secretary/Treasurer
Bjarni Kristjannson ([email protected])
Maximal Software
Arlington, VA 22201
(703) 522-7900

Board of Directors

Andrew Boyd ([email protected])
PROS Strategic Solutions, Inc.
term expires in 1999

Bruce Golden ([email protected])
University of Maryland
term expires in 2001

Christopher Jones ([email protected])
University of Washington
term expires in 1999

Jeffery L. Kennington ([email protected])
Southern Methodist University
term expires in 2001

Matthew J. Saltzman ([email protected])
Clemson University
term expires in 2000

Carol Tretkoff ([email protected])
ILOG, Inc.
term expires in 2000

Newsletter Staff

Co-Editors

S. Raghavan ([email protected])
Decision & Information Technologies Dept.
Robert H. Smith School of Business
University of Maryland
College Park, MD 20742
(301) 405-6139

Tom Wiggen ([email protected])
Dept. of Computer Science
University of North Dakota
Grand Forks, ND 58202-9015
(701) 777-3477

Associate Editors

John N. Hooker ([email protected])
Carnegie-Mellon University
Graduate School of Industrial Administration
Pittsburgh, PA 15213
(412) 268-3584

Views expressed herein do not constitute endorsement by INFORMS, the INFORMS Computing Society or the Newsletter editors.

This position is available. Contact the editors.


What about parallel programming and high-end systems?

Greg Astfalk
Hewlett-Packard Company
3000 Waterview Pkwy.
Richardson, TX
[email protected]

We are going to talk about parallel computing and high-end systems. Parallel processing does not need any introduction, nor definition. Despite this we offer one: to invoke concurrent computations for portions of a single application. Key are the words "concurrent" and "single application." There are many important applications that are so blatantly parallel that they are not of concern or interest to us here. Also not of interest here is the notion of the concurrent throughput of many individual jobs. Parallel processing is often thought of as a new topic; it isn't. Parallelism has been around for quite some time, yet despite this we cannot call it a mature technology.

High-end systems may need a bit of explaining. The issue is that the term means different things to different people. Someone might, legitimately, consider a cluster of PCs (e.g., a Beowulf cluster) a high-end system. Others might say that it is a classic Cray supercomputer architecture. Both can be correct depending on the application and numerous other factors, some of which are fiercely "religious" in nature.

Prime-time is that temporal window when most people are watching television. The key is large numbers relative to any other time. With a view toward parallel processing, its prime-time will be when large numbers of people are utilizing it. Today this is, without debate, not the case. Considering the universal set of computer users and programmers, it is a minority that use it, understand it, have the time, charter and talent to invoke it, or can benefit from it. What is holding it back? Two things: ease of use and affordability.

Owing to lack of space we are brief and thus make only what we feel are the most important points. To do either topic complete justice would require a tome of epic proportions. By its nature this note may be considered subjective, despite our attempt at an objective view. We will address these two points from a very pragmatic perspective. Naturally, since this note isn't authored via a consensus vote, it will be viewed either favorably or as heresy and BS. This author claims full responsibility either way. Some might say this note is cynical; we respond by saying it is honest.

Message from the Editors

S. Raghavan (U of Maryland)
Tom Wiggen (U of North Dakota)

In this issue, we are pleased to feature two intriguing articles by Asim Roy of Arizona State University and Greg Astfalk of Hewlett-Packard.

Asim tries to expose some of the shortcomings of neural nets as they are presently implemented. A related article by Professor Roy will appear in the December issue of OR/MS Today, and he plans to offer a tutorial at the ICS Conference in Cancún in January 2000.

Greg Astfalk writes about the present state of the art in our exploitation of parallelism. Greg concludes that this is not yet ready for prime-time.

We are also pleased to include a news item about the presentation of the von Neumann prize to Fred Glover. As you may recall, Fred wrote the feature article for the last issue of this newsletter.

As newsletter editors, we are continually on the lookout for regular columns that the ICS membership will enjoy. One example of this is the "Member Profile" series, which comes about through the initiative and efforts of associate editor John Hooker. There must be a number of candidates for other regular columns that you would enjoy reading (web-based applications, web-based teaching and/or learning, views from industry, etc.), and there may be some enterprising individuals among our readers who would enjoy producing columns in those special areas for the ICS newsletter. If you have an idea for a regular newsletter column or a special column, please let us know.

The front cover of this newsletter does not contain any official ICS identifying marks. As INFORMS' newest society, we are at present without official logos of any sort. The society's officers are working to remedy this situation. Relay your suggestions to an ICS officer.

Finally, we welcome your letters and emails to the editors containing news notes, opinions or anything else you want to tell us.


The prerequisites

If the reader will permit the somewhat parochial perspective, we will focus on all endeavors in this field, with the exception of research. Our justification is that research is by its nature expected to take time, hard work, false starts, etc. It is by research that the state of the art is advanced. In this note we are considering the practical application of parallel computing.

To actually do parallel computing there are three prerequisites: (1) the need, (2) the machine, and (3) the charter and the time. The need means that the user is faced with a problem that is large enough and important enough to justify the expenses and efforts of items (2) and (3). A problem that can be accomplished in a reasonable time on a workstation or server is not likely a candidate. What constitutes a "reasonable" time means different things to different people.

To do parallel computation requires a parallel machine. Thus the user's organization either has to have such a system, be compelled enough by the intended parallel applications to buy one, or buy time on one outside the organization. Subtle and implicit in this is that the job (i.e., the need) must have some portion of, or the entire, system to itself to actually run in parallel (there are exceptions to this -- gang scheduling -- but they are beyond the scope of this note). This reinforces the "need" part. The application must warrant having a rather expensive computing resource to itself.

The charter and time are necessary since the user's management must allow the programmers and developers the time to make the code run in parallel. It is not uncommon to require efforts of 6-12 person-months in order to get a fully functional parallel code. An exception to this point is that the practitioner could simply use a pre-existing ISV (independent software vendor) code that has already been parallelized and solves the problem at hand. Then item (3) is not relevant.

So what is different in the context of parallel programming? It is a fact that parallel programming is (much?) more difficult than sequential programming. Thus it will take us longer to code the application, so it is important that the application in question warrants the extra effort. The justification could be the sheer importance of the code to the funding (i.e., developing) organization, or the sheer size of a computation that requires going to the extra effort to get the time-to-solution down as low as possible.

When an organization buys a computing resource the inherent expectation is that it will enable better, or more timely, products and product development times. This applies to database systems, high-end technical servers, or desktop NT systems. Another characteristic of virtually all organizations that have large IT budgets for computing systems is that they have a community of users. Should the multiple processors of a parallel system be used to serve multiple users, or many independent tasks, in throughput mode? If the parallel system is dedicated to a single application for concurrent solution then there are two compounding factors. First, the p processors being used concurrently are denied to the p, or more, users that could be working on their applications on the same system. Thus parallel computations represent a lost opportunity cost to the organization unless the parallel application's benefit outweighs it. The compounding factor is that parallel applications never fully utilize all of the processors that they are "using." We discuss this in the next section.


From the current President's Information Technology Advisory Committee (PITAC): "there is substantive evidence that current scalable parallel architectures are not well suited for a number of important applications, especially those where the computations are highly irregular or those where huge quantities of data must be transferred from memory to support the calculation." At least the reader will realize it isn't just this author that has a less than optimistic view of parallelism.

Measuring parallel programs

In the field of parallel programming there are any number of measures of a parallel application: number of processors used, sustained flops, speedup, efficiency, isoefficiency, etc. While these measures do tell us something, they are all off the mark regarding what really matters. The only important measure is that the time-to-solution is less when the application is run in parallel. We should also point out that time-to-solution ought to mean from the time the complete application is started until it is finished. Achieving, and reporting, linear speed-up on a kernel which represents only a portion of the application's total time is misleading from the ultimate end-user's perspective. A result that fails on this measure may still hit the mark in a research setting, but not in the real world.
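To make the kernel-versus-whole-application point concrete, here is a small illustration of our own (not from the article), using the familiar Amdahl-style argument and made-up numbers: even a perfectly parallelized kernel buys only a modest reduction in total time-to-solution when the rest of the application stays serial.

    # A back-of-the-envelope sketch (hypothetical values): time-to-solution for
    # the WHOLE application when only a fraction f of the work parallelizes
    # perfectly over p processors.
    def time_to_solution(t_serial, f, p):
        """Serial portion plus the perfectly parallelized kernel."""
        return t_serial * ((1.0 - f) + f / p)

    t1 = 100.0   # hypothetical single-processor run time, in minutes
    for p in (1, 8, 64):
        t = time_to_solution(t1, f=0.80, p=p)
        print(f"p={p:3d}  time={t:6.1f} min  overall speedup={t1 / t:4.1f}x")
    # With 80% of the work in the kernel, 64 processors cut time-to-solution
    # only by a factor of about 4.7 -- far from the linear speedup reported
    # on the kernel alone.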

Let's be honest about what performance we can expect. Without knowing anything about the application or coding, this author makes the assertion that on a contemporary RISC processor the application will get 10% of peak performance in sustained, sequential execution. Be advised that the 3-σ on this number is large. When parallel execution is invoked the assertion, again by this author, is that you lose 50%. Specifically, if you sum the 10% of peak over the p processors involved in the computation and then divide that by 2, you get a guesstimate of what will be delivered from the parallel application. The 50% efficiency is for a moderate number of processors, say 8-16. It invariably gets lower with increasing p. We know of cases where only 1% (no typo, we mean one (1) percent) of the aggregate performance of the system is realized. This is, by any perspective, disappointing. For emphasis, we say again that these numbers can vary substantially depending on the application, algorithm, and coding.
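The rule of thumb above reduces to one line of arithmetic; the sketch below (our illustration, with assumed values) simply spells it out.

    # The author's guesstimate, parameterized. All values are assumptions with
    # a large variance: ~10% of peak per processor sequentially, and roughly
    # half of that again once the code runs in parallel on a moderate p.
    def delivered_gflops(peak_per_cpu_gflops, p,
                         sustained_fraction=0.10, parallel_efficiency=0.50):
        return peak_per_cpu_gflops * p * sustained_fraction * parallel_efficiency

    # e.g., 16 processors, each with a 1 Gflops theoretical peak:
    print(delivered_gflops(1.0, 16))   # about 0.8 Gflops actually delivered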

Parallel programming

What is the "state of the art" in parallel programming today? It involves significant neurons. The actual effort of producing a parallel code has a significant "human in the loop" component. It would be nice, and highly useful, if there were a way to automagically produce efficient parallel code. Be advised: there isn't!

We have worked on parallel computing for a long time, and where exactly are we today? It still requires careful programming, careful attention paid to data locality and control decomposition, and a rather thorough understanding of computer architectures. This is further complicated by the many parallel architectures that are in use today.

If you step back far enough from the parallel programming issue you have the sense that it is practiced the same as it was 15 years ago. The primary method is explicit decomposition of the application (data or control) via message-passing. It is certainly true that MPI, and its forerunner PVM, are much better in their syntax and functionality than the old vendor-specific messaging libraries. However, from the programmer's perspective the task is all too familiar.

What MPI has done is significant in that it offers the greatest common denominator for parallel programming. MPI's portability is significant in that an MPI application can run on any type of architecture in use today. This includes SMPs, ccNUMA systems, clusters, Internet-based systems, etc. To be a bit more quantitative, we have data showing that for this author's company's user population something like 50-60% of the applications are MPI-based. This includes a collection of systems that are mostly SMP architectures.
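For readers who have not seen explicit message-passing decomposition, the fragment below is a minimal sketch of the style of programming being described. It uses the mpi4py Python bindings, which post-date this article and stand in here purely for illustration; the structure (decompose the data, compute locally, communicate explicitly) is the same in Fortran or C with MPI.

    # Minimal data-decomposition sketch with message passing (illustrative only).
    # Each rank sums its own slice of a global index range; the partial sums
    # are combined on rank 0 with an explicit reduction.
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()

    n = 1_000_000                        # global problem size
    chunk = n // size
    lo = rank * chunk
    hi = n if rank == size - 1 else lo + chunk

    local_sum = sum(float(i) for i in range(lo, hi))      # purely local work
    total = comm.reduce(local_sum, op=MPI.SUM, root=0)    # explicit communication

    if rank == 0:
        print("global sum:", total)

Such a program would be launched with something like "mpirun -np 4", one process per processor; the programmer, not the compiler, decides how the data and the communication are laid out.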

Reports are available that show efforts in the 6-12 person-month range to parallelize a code. This is done by smart people. The net result for this effort is a far too large disparity between the theoretical peak performance of the system and the performance that the application delivers.


The difficulty lies in the ability of compilers to automatically parallelize legacy code. Much research has been done in this area and there is substantial progress. To the average practitioner this technology has shown little benefit. If a new software project is being undertaken there are now quite a few explicitly parallel high-level languages that could be used. Using these languages avoids the need for message-passing. The portability, longevity, and support of these languages have restricted their use to researchers willing and able to endure the shortcomings. The real test of a new language's acceptance and success is when multiple vendors offer it with support, future releases, documentation, etc. Then, and likely only then, will the independent software vendors (ISVs) consider using it. Until that time it is not prime-time for these languages.

There exists a tension between the computation's decomposition (data and control) and the architecture's forte. At the end of the day it isn't uncommon to find that the parallel computation efforts are more troubling than the efforts associated with the original physics. Have we turned physics into difficult programming problems?

High-end hardware

Technology has driven the hardware used in the parallel computing arena through several phases. The good news is that there are indications that a convergence is taking place in the parallel hardware. Despite this there are still multiple choices, and this naturally leads to several ways to program parallel applications.

Machines today are hierarchical. We start from the single processor and then move to connected sets of processors that share memory: the SMP. Considering the SMP as a "node," the nodes can be tightly integrated into a ccNUMA system. In a ccNUMA system any processor in any node can directly address any byte of memory in any other node by processor-issued load/store instructions. That is to say, the tightly integrated set of nodes has the look and feel of an SMP, albeit with different latencies and bandwidths to some portions of the memory.

The individual SMPs could also be connected in a somewhat looser fashion by, for example, Myrinet. Here the Myrinet connection allows for nearly direct connection of memory in an individual SMP to that in another. While the latency and bandwidth are not nearly so good as for the ccNUMA systems, they are quite respectable.

After this level of hierarchy we have true clusters of SMPs, ccNUMA systems or Myrinet-connected systems. These are connected by standard interconnects such as Ethernet, FDDI, HiPPI, or any other standard interconnect. The distinction here is that there are multiple autonomous operating systems. The result is that the global view of memory and the i/o space is lost.

Hardware is converging to the "clustering" of SMPs. It is with reluctance that we use the word clustering, since this term is overloaded in the industry. If you look at the next hierarchical level in hardware above the individual processor it is the SMP. The size of the SMP varies from 2 to 32 processors. The processors constituting the SMP vary from Pentium Pros to nearly-Gflops PA-RISC and other processors. The important trait of the SMP is that it offers shared memory. While this sounds good at first, there are issues that aren't all that spectacular.

The days of systems custom-designed for the technical-only market are gone and will likely never return. Bandwidth "costs," and if the price of the resulting system is too high it won't sell in large enough volume. Vendors must leverage the volume commodity market space and products in order to survive. This dictates the use of RISC processors. Despite the somewhat less desirable characteristics of RISC processors relative to classic supercomputers, users will need to make this transition in order to use the high-end systems of the future. Future high-end systems will be commodity-processor based.

High-end vendor economics

At this point we leave the purely technical confines of the preceding sections and discuss what may well be the more important perspective: the economics of the computer industry. It is this author's opinion that the high-end systems marketplace is largely driven by business realities.


Rick Belluzzo, SGI's Chairman and CEO, has stated that SGI is "currently redefining its supercomputing strategy by folding the traditional vector technologies into its Origin server lines because the current supercomputing model 'is bankrupt and it doesn't work.'" This is a strong statement coming from the company that started, defined, and leads the supercomputing industry.

The world-wide high-end server market is approximately 2 billion dollars a year. This is for servers selling at 1 million dollars and higher. In the absence of any competition a vendor could thus build a 2 billion dollar business. A vendor that is fiscally healthy can spend about 10% of its annual revenue on research and development. Is this enough? The answer is "no." The cost to develop a high-end system can exceed this amount. A rather extreme example is Cray Computer Company, which spent 350 million dollars to develop its offering. Making it even more difficult is the constraint that a high-end vendor really needs to have more than one design team in place. This is required since the time to develop a system is longer than the effective market life of a system.
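The arithmetic behind that "no" is simple enough to write down; the figures below are just the rounded numbers quoted in the text, not additional data.

    # Rough numbers as stated in the article (heavily rounded, illustrative only).
    market = 2_000_000_000          # world-wide high-end server market, dollars/year
    rd_fraction = 0.10              # share of revenue a healthy vendor can spend on R&D
    development_cost = 350_000_000  # an extreme single-system development cost

    rd_budget = market * rd_fraction
    print(f"R&D budget even with the whole market: ${rd_budget / 1e6:.0f}M per year")
    print(f"one extreme high-end design effort:    ${development_cost / 1e6:.0f}M")
    # $200M a year against a $350M design -- and more than one design team is needed.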

The solution to this is what is practiced by all the large vendors today: have alternate sources of revenue other than just the high-end technical market. Systems are designed out of more and more commodity parts and the design is targeted to serve, for example, both the technical and commercial markets. This provides a larger base from which to earn the revenue needed to sustain the development of systems.

How this came to pass is a confluence of a number of factors. Among them are (1) the increasingly expensive development costs of high-end systems, (2) the shrinking high-end market space, (3) the lack of clear differentiation between commodity computers and true supercomputers, and (4) the convergence between the technical and "commercial" spaces.

On item (3) we make the point that when the Cray 1 was first introduced it was substantially better than the status quo systems in use: specifically, 160 versus 4 Mflops and 12.5 versus 100 nanoseconds of latency. Today it is not unusual to find that a high-end commodity workstation can outperform a high-end classic supercomputer (on some applications). Item (4) is typified by the emergence of a new and broader usage model in the commercial arena. Commercial companies are now doing decision support, financial modeling, fraud detection, data mining, and other applications. These often stress a system in ways that are similar to technical computing rather than the classic transaction processing.

Epilogue

So is parallel programming ready for prime-time today? No. Is parallel programming ready for the masses? No. Is parallel programming going to become so easy that it will be usable by the masses? Not for years. Is parallel computation important? Yes. Is there a future for parallel processing? Definitely!

The advances that are required to make the preceding paragraph more optimistic are almost exclusively in algorithms and software. The levels of spending and effort are not in step with the importance of these areas. Parallel hardware, while it is still maturing, developing, and converging, is further along in maturity than the parallel software.

The biggest issue inhibiting the main-streaming of parallelism is its inherent difficulty, combined with the mixed results. After significant hard work by talented people, some applications show impressive performance. That statement needs to read, "After programming by the average programmer, almost any application can achieve good parallel performance." Alas, this is some time in the future. When it is a true statement, then parallel processing's prime-time will have arrived.

We do look forward to the wider acceptance of parallelism. It will be interesting to participate in the unfolding of the plot of this on-going saga.

Greg Astfalk is the Chief Scientist of Hewlett-Packard's High Performance Systems Division in Richardson, TX.


ICS Member Profile: Richard E. Rosenthal

"I was very fortunate to get introduced to operations research as an undergraduate at Johns Hopkins University. John Liebman was the first to tell me about OR and Eliezer Naddor taught my first class in it, using Harvey Wagner's book. After two weeks of Naddor's great lectures and Wagner's fun problems, I was hooked. The next year, I had a two-semester course in network flows and graphs with Manny Bellmore, using Ford and Fulkerson and lots of classic papers from the literature. Coming from the first generation of my family to go to college, I had absolutely no idea of what being an academic was all about. Those three professors influenced me more than they could have imagined. Bellmore's course introduced me to the idea of research and it led me to choose optimization as a specialty."

That's how Rick Rosenthal describes his start in OR. He graduated from Johns Hopkins in 1972 and then completed a Ph.D. at Georgia Tech in 1975. His first faculty position was in management science at the University of Tennessee. Then in 1984, he went for a one-year visit to the Naval Postgraduate School at the invitation of Jerry Brown and has been there ever since. He became Editor in Chief of Naval Research Logistics in 1988. A year and a half ago, he took over as Chairman of the NPS Operations Research Department, one of the largest, oldest, and, according to NPS Dean Peter Purdue, one of the strongest OR programs in the U.S.

With over 40 faculty, close to 200 graduate students, and $2.8 million in 1998 external research funding, managing the OR department takes a lot of time. But Rick says he is counting on avoiding administration "as a life sentence," so he is keeping a strong interest and involvement in research.

He never forgot the first lesson of Professor Naddor's class, which was that all the techniques of OR exist for the purpose of solving real problems. The first real problem Rick worked on was a consulting job as a graduate student. "It was offered to all the professors first, but they thought the client was a flake and I wouldn't have known any better," he says. The job did not turn out successfully and this was a defining incident in his career. The problem involved a furniture factory that the client had recently purchased. Through years of mismanagement, it had accumulated extremely unbalanced parts inventories. "Imagine a plant with a week's supply of table tops and twenty years' worth of table legs. To make matters worse, the finished products are not particularly profitable." The client wanted to know whether to buy more table tops or to scrap the table legs or some combination of the two. There were 100 products facing this dilemma, and he wanted answers immediately.

"I went home after the first meeting and designed a beautiful linear program. Beaming with pride, I went into the Georgia Tech computer center the next day, opened a private account, and asked for access to an LP solver. After all my courses on math programming algorithms at Hopkins and Tech, how hard could it be to use someone else's software implementation of what I thought I understood so well? I figured it would take less than an hour to master the software, so I made an appointment with the client to meet a little later and told him to bring the data. I promised not to go home before presenting him with the optimal solution."

The only LP software available was LP1108, running on the Univac 1108. "The manual was twice the size of the Manhattan phone book, and it required a lot of experience in Job Control Language, which I had never heard of. Suffice it to say, I was not prepared for my meeting and did not get that beautiful LP formulation even close to running that day. No one had ever taught or even mentioned to me what was involved in creating a matrix generator for an LP."


ICS member profiles are intended to keep us up to date on what some of our members are doing. Profiled persons come from academia, industry, military and civilian government.

As a footnote to the story, credit must be given to Bruce Schmeiser, a fellow Georgia Tech graduate student, who had excellent computing skills and industrial experience. "Bruce was already on his way to becoming a famous name in simulation and couldn't care less about LP. He rigged up a clever little FORTRAN program that saved my neck."

The result of this humbling experience was that Rick's research and teaching in optimization have always contained an emphasis on implementation. It also explains his joy and excitement when, first, conversational solvers like LINDO came along and then algebraic modeling languages like AMPL and GAMS emerged. Rick started teaching a 4-day short course on optimization modeling using GAMS in 1988, a course he still offers twice a year. Some companies send people every year to the class. Past students often provide example models or come back as guest speakers. "I love to demonstrate to people with real problems how algebraic modeling languages make it possible to experiment with intricate mathematical programming approaches very quickly and effectively."

Rick's optimization research began with algorithmic developments. "I have a few favorites from those days. The paper on nonlinear networks applied to hydropower [1] got a lot of attention in the water resources community. My work with Steve Brady on an interactive computer graphical solution to a location problem [2] does not read too badly after all these years. Terry Harrison's dissertation on multi-objective optimization applied to forestry [3] helped me learn about both fields. Another pet project related to my favorite result from Bellmore's networks class: the basis-tree theorem for pure network flows. Thanks to the work of my then-future NPS colleagues Jerry Brown and Gordon Bradley, among others, everyone knew that this theorem lets you do fantastically efficient computing with labeling algorithms on the basis-tree instead of linear algebra on the basis matrix. In a paper that probably no more than three people have read, I showed there exists another tree corresponding to the basis-inverse and you can implement the same algorithms just as well with this inverse-tree [4]."

Current research is focused on military applications of optimization, jointly with students and colleagues at the Naval Postgraduate School, such as [5]. "Our students are very special. They have already figured out what they want to be when they grow up, and have come to us after already proving they are first-rate performers at their chosen professions. They enable the faculty to develop expertise in military problem areas to which OR and computers can be applied." One recent example of student research supervised by Professor Rosenthal is Navy Lieutenant Scott Kuykendall's thesis [6], which grew directly out of a situation that had frustrated the lieutenant several times in his previous assignment as the Tomahawk strike officer on an Aegis cruiser. Lt. Col. Steve Baker of the Air Force recently completed a Ph.D. dissertation [7], which is a finalist for the INFORMS George B. Dantzig Prize. With Professors Laura Melody Williams and Rosenthal of NPS, and David Morton of the University of Texas at Austin, Baker developed an airlift optimization model that has been used by the Air Force to answer questions about aircraft fleet selection, airfield infrastructure investment, and airlift concepts of operation.

Rick Rosenthal summarizes his career: "I have been extremely lucky to have found OR at an early age, and to have worked with great teachers, colleagues and students all along the way. Dr. Naddor passed away several years ago, but I think he would be very happy to see the applications focus of my work."

[1]. Rosenthal, Richard E., "A Nonlinear Network Flow Algorithm for Maximization of Benefits in a Hydroelectric Power System," Operations Research, Vol. 29, 763-786 (1981).
[2]. Brady, Stephen D. and Richard E. Rosenthal, "Interactive Computer Graphical Solutions of Constrained Minimax Location Problems," AIIE Transactions, Vol. 12, 241-248 (1980).
[3]. Harrison, Terry P. and Richard E. Rosenthal, "An Implicit/Explicit Approach to Multiobjective Optimization with an Application to Forest Management Planning," Decision Sciences, Vol. 19, 190-210 (1988).
[4]. Rosenthal, Richard E., "Representing Inverses in Pure Network Flow Optimization," European Journal of Operational Research, Vol. 23, 356-366 (1986).
[5]. Rosenthal, Richard E. and William J. Walsh, "Optimizing Flight Operations for an Aircraft Carrier in Transit," Operations Research, Vol. 44, 305-312 (1996).
[6]. Kuykendall, Scott D., "Optimizing Selection of Tomahawk Cruise Missiles," MS thesis in Operations Research, Naval Postgraduate School, March 1998.
[7]. Baker, Steven F., "A Cascade Approach for Staircase Linear Programs with an Application to Air Force Mobility Optimization," Ph.D. dissertation in Operations Research, Naval Postgraduate School, June 1997.

A Science in Trouble, continued from page 1

preceded by rising unrest in the community." And Malsburg is one of the founders of this field. Another founder of the field and a past president of the International Neural Network Society (INNS) confided to me that "the neuro-boom is over." But many other scholars have kept on fighting the arguments against the current science on brain-like learning. Another founder of the field and a past president of INNS publicly disagreed with me at the recent debate in Alaska, saying: "In brief, I disagree with everything he (Asim Roy) said."

Many distinguished scholars have participated in the two open, public debates at the last two international conferences on neural networks. These debates centered on various aspects of brain-like learning, as discussed later in this article. The first debate, at the International Conference on Neural Networks (ICNN'97) in Houston, Texas in June, 1997, included four past presidents of INNS and five of the plenary speakers. The second debate, at the World Congress on Computational Intelligence (WCCI'98) in Anchorage, Alaska in May, 1998, included five past presidents of INNS, founders and gurus of the field. A summary of the first debate has been published in the INNS Newsletter of May, 1998, and on the Internet through the various neural network-related mailing lists. A summary of the second debate is under preparation.

The debate about connectionism is nothing new. The argument between the symbol system hypothesis of artificial intelligence and the massively parallel system conjecture of artificial neural networks or connectionism has still not abated. Marvin Minsky of MIT characterized connectionism as "naïve" at the first international conference on neural networks in San Diego in 1988. And Minsky and Seymour Papert not only showed the limitations of the earlier simple neural networks, the perceptrons, but were also the first ones to raise the deeper question of the computational complexity of learning algorithms ("Epilogue: The New Connectionism" in [8]). But the neural network field moved along heedlessly with its research agenda, ignoring all the deeper and more disturbing questions raised by thoughtful critics. However, a scientific field is destined to stumble sooner or later when it tries to skirt legitimate questions about its founding ideas. Now faced with fundamental challenges to the assumptions behind their brain-like learning algorithms, prominent researchers in the field are finally calling for a "shake up of the field of neural networks" and for its "rebirth."

Some background information on artificial neural networks

Connectionism or artificial neural networks is the field of science that tries to replicate brain-like computing. The brain is understood to use a parallel computing mechanism where each computing element (a neuron or brain cell, in the terminology of this science) in this massively parallel system is envisioned to perform a very simple computation, such as y_i = f(z_i), where z_i is assumed to be a real-valued input, y_i is either a binary or a real-valued output of the ith neuron, and f is a nonlinear function (see Figure 1). The nonlinear function f, also called a node function, takes different forms in different models of the neuron; a typical choice for the node function is a step function or a sigmoid function. The neurons get their input signals from other neurons or from external sources such as various organs of the body like the eyes, the ears and the nose. The output signal from a neuron may be sent to other neurons or to another organ of the body.

[Figure 1. A layered network with an input layer, a hidden layer, and an output layer, together with a single (ith) neuron: the inputs x_i1, x_i2, ..., x_in arrive on connections with weights w_i1, w_i2, ..., w_in, the weighted input is z_i = Σ_j w_ij x_ij + θ_i, and the output is y_i = f(z_i).]

Studies in neuroscience and neurobiology show that different parts of the brain perform different tasks such as storage of short or long term memory, language comprehension, object recognition and so on. A particular task is performed by a particular network of cells (hence the term neural networks) designed and trained for that task through the process of learning or memorization. These networks, when invoked to perform a particular task, then send their outputs to other parts of the brain or to an organ of the body.

A network can have many layers of neurons, where the outputs of one layer of neurons become the inputs to the next layer of neurons. And a network can have more than one output signal; thus the output layer can have more than one neuron. Different neural network models assume different modes of operation for the network, depending somewhat on the function to be performed. A neural network model for pattern classification is often conceived to be a feedforward type network, where the input signals are propagated through different layers of the network to produce outputs at the output layer. On the other hand, a neural network model for memory is often conceived to be of the feedback type (also called recurrent networks or nonlinear dynamical systems), where the outputs of the network are fed back to the network as inputs. This process of feedback continues until the network converges to a stable set of output values or continuously cycles among a fixed set of output values.

Let x_i = (x_i1, x_i2, ..., x_in) be the vector of input signals to the ith neuron, the input signals being from other neurons in the network or from external sources. Neural network models assume that each input signal x_ij to the ith neuron is "weighted" by the strength of the ith neuron's connection to the jth source, w_ij. The weighted inputs, w_ij x_ij, are then summed to form the actual input z_i to the node function f at the ith neuron: z_i = Σ_j w_ij x_ij + θ_i, where θ_i is a constant, called the threshold value. As mentioned before, some typical node functions are (1) the step function, where f(z_i) = 1 if z_i ≥ 0 and f(z_i) = 0 otherwise, and (2) the sigmoid function, where f(z_i) = 1/(1 + e^(-z_i)).
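As a concrete (and purely illustrative) rendering of the neuron model just described, the short Python sketch below computes z_i and applies either node function; the numerical values are made up.

    # Minimal sketch of a single model neuron: z_i = sum_j w_ij * x_ij + theta_i,
    # followed by a step or sigmoid node function f.
    import math

    def neuron_output(x, w, theta, node_function="sigmoid"):
        z = sum(wj * xj for wj, xj in zip(w, x)) + theta    # weighted input z_i
        if node_function == "step":
            return 1.0 if z >= 0 else 0.0                   # f(z_i) = 1 if z_i >= 0, else 0
        return 1.0 / (1.0 + math.exp(-z))                   # f(z_i) = 1 / (1 + e^(-z_i))

    # Example: one neuron with three input signals (hypothetical values).
    print(neuron_output(x=[0.5, -1.0, 2.0], w=[0.8, 0.3, -0.1], theta=0.05))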

A network of neurons is made to perform a certain task (memory, classification and so on) by designing and training an appropriate network through the process of learning or memorization. The design of a network involves determining (a) the number of layers to use, (b) the number of neurons to use in each layer, (c) the connectivity pattern between the layers and neurons, (d) the node function to use at each neuron, and (e) the mode of operation of the network (e.g., feedback vs. feedforward). The training of a network involves determining the connection weights [w_ij] and the threshold values [θ_i] from a set of training examples. For some learning algorithms, like back-propagation [14,15], the design of the network is provided by the user or some other external source. For other algorithms, like Adaptive Resonance Theory (ART) [5], reduced Coulomb energy (RCE) [10], and radial basis function (RBF) networks [9], the design of the network is accomplished by the algorithm itself, although other parameter values have to be supplied to the algorithm on a trial and error basis to perform the design task.

The training of a network is accomplished by adjusting the connection weights [w_ij] by means of a local learning law. A local learning law is a means of gradually changing the connection weights by an amount ∆w_ij after observing each training example. A learning law is based on the general idea that a network is supposed to perform a certain task and that the weights have to be set such that the error in the performance of that task is minimized. A learning law is local because it is conceived that the individual neurons in the network are the ones making the changes to their connection weights or connection strengths, based on the error in their performance. Local learning laws are a direct descendant of the idea that the cells or neurons in a brain are autonomous learners. The idea of "autonomous learners" is derived, in turn, from the notion that there is no homunculus or "little man" inside the brain that "guides and controls" the behavior of different cells in the brain. The "no homunculus" argument says that there couldn't exist a distinct and separate physical entity in the brain that governs the behavior of other cells in the brain. In other words, as the argument goes, there are no "ghosts" in the brain. So any notion of "extracellular control" of synaptic modification (connection weight changes) is not acceptable to this framework. Many scientists support this notion (of cells being autonomous learners) with examples of physical processes that occur without any external "control" of the processes, such as a hurricane.

So, under the connectionist theory of learning, the connection weight w_ij(t), after observing the tth training example, is given by w_ij(t) = w_ij(t-1) + ∆w_ij(t), where ∆w_ij(t) is the weight adjustment after the tth example and is determined by the local learning law. Donald Hebb [6] was the first to propose a learning law for this field of science, and much of the current research on neural networks is on developing new learning laws. There are now hundreds of local learning laws, but the most well-known among them are back-propagation [14,15], ART [5] and RBF networks [9]. To give an example, the back-propagation learning law is as follows: ∆w_ij(t) = -η(∂E/∂w_ij(t)) + α∆w_ij(t-1). Here η is the learning rate (step size) for the weight update at step t and α is a momentum gain term. E is the mean-square error of the whole network based on some desired outputs, in a supervised mode of learning, where a teacher is present to indicate what the correct output should be for any given input. Back-propagation is a steepest descent algorithm and -∂E/∂w_ij(t) is the steepest descent direction (the negative of the gradient).
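To make the update rule concrete, here is a small sketch of our own (with made-up gradient values) of the weight adjustment exactly as written above, applied to a single weight over three training examples.

    # One-weight sketch of the back-propagation update with momentum:
    # delta_w(t) = -eta * dE/dw(t) + alpha * delta_w(t-1);  w(t) = w(t-1) + delta_w(t).
    def backprop_step(w, grad, prev_delta, eta=0.1, alpha=0.9):
        """Apply one local weight update with learning rate eta and momentum gain alpha."""
        delta = -eta * grad + alpha * prev_delta
        return w + delta, delta

    w, prev_delta = 0.25, 0.0
    for grad in (0.40, 0.32, 0.18):   # hypothetical values of dE/dw at steps t = 1, 2, 3
        w, prev_delta = backprop_step(w, grad, prev_delta)
        print(f"w = {w:.4f}")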

The distinction between memory and learning

Two of the main functions of the brain are memory and learning. There are of course many categories of memory (short term, medium term, long term, working memory, episodic memory and so on) and of learning (supervised, unsupervised, inductive, reinforcement and so on). In order to characterize the learning behavior of the brain, it is necessary to distinguish between these two functions. Learning generally implies learning of rules from examples. Memory, on the other hand, implies simple storing of facts and information for later recall (e.g., an image, a scene, a song, an instruction). Sometimes these terms are used interchangeably in the literature, and in everyday life memory is often confused with learning. But the processes of memorization are different from those of learning. So memory and learning are not the same.

Learning or generalization from examples

Learning of rules from examples involves generalization. Generalization implies the ability to derive a succinct description of a phenomenon, using a simple set of rules or statements, from a set of observations of the phenomenon. So, in this sense, the simpler the derived description of the phenomenon, the better the generalization. For example, Einstein's E = MC² is a superbly succinct generalization of a natural phenomenon. And this is the essence of learning from examples. So any brain-like learning algorithm must exhibit this property of the brain - the ability to generalize. That is, it must demonstrate that it makes an explicit attempt to generalize and learn. In order to generalize, the learning system must have the ability to design the appropriate network.

The problems of connectionism - some major misconceptions about the brain

A misconception - no synaptic change signals are allowed to the cells from other sources within the brain

The notion that each neuron or cell in the brain is an "autonomous/independent learner" is one of the fundamental notions in this field. Under this notion, it is construed that individual cells modify their synaptic strengths (connection weights) based solely on their "input and output behavior." The input and output information of a cell may include information about the error in the performance of the given task by the network and an individual cell's contribution to that error; see, for example, the back-propagation learning law in the last section. This notion implies that no other physical entity external to the cell is allowed to "signal" it to adjust its connection strengths. All of the well-known local learning laws developed to date most faithfully adhere to this notion [2, 3, 5, 6, 7, 9, 10, 14, 15]. However, there is no neurobiological evidence to support this premise. In fact, there is a growing body of evidence that says that extrasynaptic neuromodulators influence synaptic adjustments "directly" [7]. The neurobiological evidence shows that there are many different kinds of neurotransmitters and receptors and many different cellular pathways for them to affect cellular changes. Cellular mechanisms within the cell are used to convert these "extracellular" signals into long-lasting changes in cellular properties. So the connectionist conjecture that no other physical entity directly signals changes to a cell's behavior is a major misconception about the brain. Beyond the neurobiological evidence, this conjecture is also logically inconsistent, as discussed later.

Another misconception - the brain does not collect and store any information about the problem prior to actual learning

In connectionism, brain-like learning algorithms cannot store any training examples (or any other information, for that matter) explicitly in memory (in some kind of working memory, that is, that can be readily accessed by the learning system in order to learn) [2, 3, 5, 6, 7, 9, 10, 14, 15]. The learning mechanism can use any particular training example presented to it to adjust whatever network it is learning in, but must forget that example before examining others. This is the so-called "memoryless learning" property, where no storage of facts/information is allowed. The idea is to obviate the need for large amounts of memory to store a large number of training examples or other facts. Although this process of learning is very memory efficient, it can be very slow and time-consuming, requiring lots of training examples, as demonstrated in [11,12]. However, the main problem with this notion of memoryless learning is that it is completely inconsistent with the way humans actually learn; it violates very basic behavioral facts. Remembering relevant facts and examples is very much a part of the human learning process; it facilitates the mental examination of facts and information that is the basis for all human learning. And in order to examine facts and information and learn from them, humans need memory; they need to remember facts. But connectionism has no provision for it.

There are other logical problems with the idea of memoryless learning. First, one cannot learn (generalize, that is) unless one knows what is there to learn (generalize). And one can find out what is there to learn "only by" collecting and storing some information about the problem at hand. In other words, no system, biological or otherwise, can "prepare" itself to learn without having some information about what is there to learn (generalize). And in order to generalize well, one has to look at a whole body of information relevant to the problem, not just bits and pieces of information at a time, as is allowed in memoryless learning. So the notion of "memoryless learning" is a very serious misconception in these fields, and is totally inconsistent with external observations of the human learning process.

A third misconception - the brain learns instantly from each and every learning example presented to it

A major dilemma for this field is explaining the fact that sometimes human learning is not instantaneous, but may occur much later, perhaps at a distant point in time, based on information already collected and stored in the brain. The problem lies with the fundamental belief in the connectionist school that the brain learns "instantaneously." Instantaneous, that is, in the sense that it learns promptly from each and every learning example presented to it by adjusting the relevant synaptic strengths or connection weights in the network. And it even learns from the very first example presented to it! The learning, as usual, is accomplished by individual neurons using some kind of a "local learning law." Note that "instantaneous learning" is simply a reflection of "memoryless learning;" just the opposite side of the same coin.

A fourth misconception - the networks are predesigned and externally supplied to the brain, and the learning parameters are externally supplied too

Another major dilemma for this field is explaining the fact that a network design, and other types of algorithmic information, have to be externally supplied to some of these learning systems, whereas no such information is externally supplied to the human brain. In fact, not just one, but many different network designs (and other parameter information) are often supplied to these learning systems on a trial and error basis in order for them to learn [5, 9, 10, 14, 15]. However, as far as is known, no one has been able to supply any network design or learning parameter information to a human brain. Plus, the whole idea of "instantaneous and memoryless learning" is completely inconsistent with their trial and error learning processes; there is supposed to be no storage of learning examples in these systems for such a trial and error process to take place. In other words, no such trial and error process can take place unless there is memory in the system, which they disallow.

In order for humans to generalize well in a learning situation, the brain has to be able to design different networks for different problems - different numbers of layers, numbers of neurons per layer, connection weights and so on - and adjust its own learning parameters. The networks required for different problems are different; it is not a "same size fits all" type of situation. So the networks cannot come "pre-designed" in the brain; they cannot be inherited for every possible "unknown" learning problem faced by the brain on a regular basis. Since no information about the design of the network is ever supplied to the brain, it follows that network design is performed internally by the brain. Thus it is expected that any brain-like learning system must also demonstrate the same ability to design networks and adjust its own learning parameters without any outside assistance. But the so-called autonomous learning systems of connectionism depend on external sources to provide the network design to them; hence they are inherently incapable of generalizing without external assistance. This implies again that connectionist learning is not brain-like at all.

Other logical problems with connectionist learning

But there are more problems with these connectionist ideas. Strict autonomous local learning implies pre-definition of a network "by the learning system" without having seen a single training example and without having any knowledge at all of the complexity of the problem. There is no system, biological or otherwise, that can do that in a meaningful way; it is not a "feasible idea" for any system. The other fallacy of the autonomous local learning idea is that it acknowledges the existence of a "master system" that provides the network design and adjusts the learning parameters so that autonomous learners can learn. So connectionism's autonomous learners, in the end, are directed and controlled by other sources after all! So these connectionist ideas (instantaneous learning, memoryless learning and autonomous local learning) are completely illogical, misconceived and incompatible with what can be externally observed of the human learning process.

Conclusions

One of the "large" missing pieces in the existing theories of artificial neural networks and connectionism is the characterization of an autonomous learning system such as the brain. Although Rumelhart [15] and others have clearly defined (conjectured) the "internal mechanisms" of the brain, no one has characterized in a similar manner the external behavioral characteristics that those mechanisms are supposed to produce. As a result, the field pursued algorithm development largely from an "internal mechanisms" point of view (local, autonomous learning, memoryless learning, and instantaneous learning) rather than from the point of view of the "external behavioral characteristics" of human learning. That flaw is partly responsible for its current troubles. It is essential that the development of learning algorithms be guided primarily by the need to reproduce a set of sensible, well-accepted external characteristics. If that set of external characteristics cannot be reproduced by a certain conjecture about the internal mechanisms, then that conjecture is not valid.

This article has described some of the prevailing notions of connectionism, shown their logical inconsistencies, and shown how they fail to properly account for some very basic aspects of human learning. So there is definitely a need for some new ideas about the internal mechanisms of the brain. From the last two debates and from recent neurobiological evidence, it appears that a very convincing argument can be made that there are subsystems within the brain that control other subsystems. This "control theoretic" notion, which allows external sources to directly control a cell's behavior and perform other tasks, is finding growing acceptance among scientists [13]. The notion has many different labels at this point: non-local means of learning, global learning and so on. To be fair, it must be acknowledged that such control theoretic notions are already used, in one form or another, in almost all connectionist learning systems. For example, all constructive learning algorithms (e.g. [5, 9, 10]) use non-local means to "decide" when to expand the size of the network. And the back-propagation algorithm itself [14, 15] depends on a non-local, external source to provide it with the design of a network in which to learn. So connectionist systems inadvertently acknowledge this "control theoretic" idea by using a "master or controlling subsystem" that designs networks and sets learning parameters for them. In other words, as baffling as it may sound, the control theoretic ideas have been in use all along; they are nothing new. Only recently, however, have such non-local means of learning been used effectively to develop robust and powerful learning algorithms that can design and train networks in polynomial time complexity [1, 4, 11, 12]. Polynomial time complexity, by the way, is an essential computational property for brain-like autonomous learning systems.
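As an illustration of what such a non-local, controlling step looks like, the loop below is a sketch under my own assumptions (not the specific algorithms of [9], [11] or [12]) that grows a small radial basis function network one unit at a time; the decision to expand the network is made by examining the error over the whole stored training set, something no single neuron's local learning law could do.

    import numpy as np

    def constructive_rbf_fit(X, y, width=1.0, tol=0.01, max_units=50):
        # X is an (n, d) array of examples, y an (n,) array of targets.
        # A controlling loop (the non-local, "control theoretic" element) decides
        # when to add another RBF unit, based on the error over all stored examples.
        centers, weights = [], np.zeros(0)
        residual = y.astype(float)
        for _ in range(max_units):
            if np.mean(residual ** 2) <= tol:          # global stopping test
                break
            worst = int(np.argmax(np.abs(residual)))   # place a unit where error is largest
            centers.append(X[worst])
            C = np.array(centers)
            Phi = np.exp(-np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2) ** 2
                         / (2.0 * width ** 2))
            weights, *_ = np.linalg.lstsq(Phi, y, rcond=None)   # refit output weights
            residual = y - Phi @ weights
        return np.array(centers), weights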

In addition, a control theoretic framework resolves many of the problems and dilemmas of connectionism. Under such a framework, learning need no longer be instantaneous and can wait until some information has been collected about the problem. Learning can always be invoked by a controlling subsystem at a later point in time. This would also facilitate understanding the complexity of the problem from the information that has been collected and stored already. Such a framework would also resolve the network design dilemma and the problems of algorithmic efficiency that have plagued this field for so long [1, 4, 11, 12]. So one can argue very strongly for such a theory of the brain, both from a computational point of view and from the point of view of consistency with externally observed human learning behavior.

Acknowledgements: I thank Fred Glover, Leon Lasdon, and others for many suggestions that improved this article. Fred Glover, in particular, was a significant inspiration behind our work that uses linear programming models to design and train neural networks. In a memo to a well-known cognitive scientist at his university, he called this debate "an epic debate."

References

1. Bennett, K.P. and Mangasarian, O.L. (1992). Neural Network Training via Linear Programming. In P.M. Pardalos (ed.), Advances in Optimization and Parallel Computing, North Holland, Amsterdam.
2. Churchland, P. (1989). On the Nature of Theories: A Neurocomputational Perspective. Reprinted as Chapter 10 in Haugeland, J. (ed.), Mind Design II, 1997, MIT Press, 251-292.
3. Churchland, P. and Sejnowski, T. (1992). The Computational Brain. MIT Press, Cambridge, MA.
4. Glover, F. (1990). Improved Linear Programming Models for Discriminant Analysis. Decision Sciences, 21(4), 771-785.
5. Grossberg, S. (1988). Nonlinear Neural Networks: Principles, Mechanisms, and Architectures. Neural Networks, 1, 17-61.
6. Hebb, D.O. (1949). The Organization of Behavior: A Neuropsychological Theory. John Wiley, New York.
7. Levine, D.S. (1998). Introduction to Neural and Cognitive Modeling. Lawrence Erlbaum, Hillsdale, NJ.
8. Minsky, M. and Papert, S. (1988). Perceptrons. MIT Press, Cambridge, MA.
9. Moody, J. and Darken, C. (1989). Fast Learning in Networks of Locally-Tuned Processing Units. Neural Computation, 1(2), 281-294.
10. Reilly, D.L., Cooper, L.N. and Elbaum, C. (1982). A Neural Model for Category Learning. Biological Cybernetics, 45, 35-41.
11. Roy, A., Govil, S. and Miranda, R. (1995). An Algorithm to Generate Radial Basis Function (RBF)-like Nets for Classification Problems. Neural Networks, 8(2), 179-202.
12. Roy, A., Govil, S. and Miranda, R. (1997). A Neural Network Learning Theory and a Polynomial Time RBF Algorithm. IEEE Transactions on Neural Networks, 8(6), 1301-1313.
13. Roy, A. (1998). Panel Discussion at ICNN'97 on Connectionist Learning. INNS/ENNS/JNNS Newsletter, appearing with Vol. 11, No. 2, of Neural Networks.
14. Rumelhart, D.E. and McClelland, J.L. (eds.) (1986). Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1: Foundations. MIT Press, Cambridge, MA, 318-362.
15. Rumelhart, D.E. (1989). The Architecture of Mind: A Connectionist Approach. Chapter 8 in Haugeland, J. (ed.), Mind Design II, 1997, MIT Press, 205-232.


Software Announcement

GeNIe 1.0

We are pleased to announce GeNIe 1.0, a free development environment for graphical decision-theoretic models.

At the moment it implements Bayesian networks and influence diagrams with, what we believe to be, a pleasant and reliable user interface. It has hierarchical submodels, a Windows-style tree view, Noisy-OR nodes, deterministic nodes, multiple decision nodes, relevance reasoning that includes designating nodes as targets, multiple utility nodes, linearly-additive Multi-Attribute Utility (MAU) nodes (we will have generalized MAU nodes in the future), value of information computation, several Bayesian network algorithms to choose from, on-screen comments, an extensive beginner-oriented HTML-based help system, and many other useful features that one would want from a development environment for graphical models. GeNIe is the main research and teaching vehicle at the Decision Systems Laboratory, so naturally it will evolve as time goes on.

GeNIe runs on Windows 95/NT computers. It comes with SMILE (Structural Modeling, Inference, and Learning Engine), a portable library of C++ classes that can be embedded in applications based on the Bayesian network and influence diagram models developed using GeNIe. SMILE has so far been compiled for Windows 95/98/NT and Unix. If there is sufficient demand, we will consider recompiling it for other environments.
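For readers new to these models, the fragment below is a toy illustration of the kind of question a Bayesian network answers. It is plain Python written for this announcement; it does not use GeNIe's or SMILE's actual classes or calls, which are not described here.

    # Two-node network: Disease -> Test. Query P(Disease | Test = positive)
    # by enumerating the joint distribution (what an inference engine automates).
    p_disease = 0.01
    p_positive_given = {True: 0.95, False: 0.05}   # P(Test = positive | Disease)

    joint_positive = {d: (p_disease if d else 1.0 - p_disease) * p_positive_given[d]
                      for d in (True, False)}
    posterior = joint_positive[True] / sum(joint_positive.values())
    print(round(posterior, 3))   # 0.161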

ITORMS: Call for Papers

ITORMS is the first electronic journal of INFORMS.

ITORMS uses the interactive electronic medium to deliver timely and updated information, employing not just textual and graphical formats but also other visual and algorithmic information. The Interactive Transactions of OR/MS publishes original, scholarly, high-quality articles and bibliographies that provide a perspective view of the OR/MS discipline and exploit the interactivity offered by the Internet. Interactivity has two dimensions in this context. First, the published articles are constantly updated by the authors (contributing editors) to offer the most recent literature on the topic of interest. Second, interactivity allows readers to learn not only from the paper being read, but also from other related resources that are available, or will be developed, on the Internet.

Loyal ICS members such as Harvey Greenberg, Chris Jones, and Fred Murphy have published in ITORMS. You are invited to submit a paper for review for publication in ITORMS. Check out the ITORMS web site at http://www.informs.org/Pubs/ITORMS, or contact the Editor, Ramesh Sharda, at [email protected]

GeNIe: Continued on page 18


Message from the Chair
Harlan P. Crowder

ILOG, Inc.
Silicon Valley, CA
[email protected]

After a lot of tenacious effort, the Computer Sciences Technical Section has a new name; we are now the INFORMS Computing Society. The designation of "Society" for INFORMS subgroups is reserved for SIGs and sections that have demonstrated a sustained commitment over time and have significantly advanced a particular aspect of operations research and management sciences. On both counts, the INFORMS Computing Society has earned this special designation.

From the early days of our discipline, the computational aspects of OR/MS have advanced in parallel with computer and computational science. Much of what we take for granted today in computational OR/MS owes a debt to computer science: the invention of sophisticated data structures for efficient algorithms; the application of computational complexity theory, which allows us to understand the capabilities, and limitations, of our problem-solving methods; the adaptation of advances in computer programming languages to the expression of mathematical models; and the introduction of computational techniques that mimic natural systems, including simulated annealing from the physical sciences and genetic algorithms from the biological sciences.

I have always been especially impressed by how the study of computational complexity in the 1970s and early 1980s gave us a fundamental understanding of the relative difficulty of a wide range of OR/MS problems. I once asked Philip Wolfe, OR/MS legend extraordinaire, what he thought was the usefulness of complexity theory for the everyday, real-world application of OR/MS. Phil said that was easy; the practitioner could now confidently march into the boss's office and say, "Boss, I don't have any idea how to solve this problem you asked me to solve. But there are a lot of very smart people who have looked at this same problem, and they don't know how to solve it either!" Phil has a way of getting to the point.

Our new designation as a Society is a special honor, but it also means we have more responsibility. New moves are afoot in INFORMS to give more leadership to subgroups, including the opportunity to have Society meetings on an annual basis. This means that INFORMS Computing Society members will need to work harder and get more involved to make the Society a continuing success. I invite you to attend our Society business meetings at the INFORMS meetings in Seattle, Cincinnati and Philadelphia, and to get actively involved. And I especially urge you to plan on attending our next big INFORMS Computing Society conference in Cancun, Mexico, the first week of 2000.

GeNIe and SMILE are free for the taking for research, teaching, and personal use. (We do ask authors of publications in which GeNIe and SMILE played a role to acknowledge it.) They can be downloaded from http://www2.sis.pitt.edu/~genie. We believe that the programs will prove useful for the community. May the smiling GeNIe grant you your three wishes :-).

Marek J. Druzdzel ([email protected]), University of Pittsburgh, Decision Systems Laboratory, School of Information Sciences, Intelligent Systems Program, and Medical Informatics Training Program

GeNIe: continued from page 17


New Books

Network Optimization: Continuous and Discrete Models by Dimitri P. Bertsekas, Massachusetts Institute of Technology. Athena Scientific, 1998, 608 pages, ISBN 1-886529-02-7.

This book provides an introductory, yet comprehensive and up-to-date, treatment of linear, nonlinear, and discrete network optimization problems, and the analytical and algorithmic methodology for solving them. It also provides a guide to important practical network applications and their algorithmic solution, highlights the interplay between continuous and discrete models, and treats extensively the associated analytical and algorithmic issues at an introductory and accessible level.

Among its special features, the book:

- provides a comprehensive account of the theory and the practical application of the principal algorithms for linear network flow problems, including simplex, dual ascent, and auction algorithms.
- covers extensively the main algorithms for specialized network flow problems, such as shortest path, max-flow, assignment, and traveling salesman.
- develops the main methods for nonlinear network problems, such as convex separable and multicommodity flow problems arising in communication, transportation, and manufacturing contexts.
- describes the main models and algorithmic approaches for integer-constrained network problems.
- contains many examples, practical applications, illustrations, and exercises.

For preface and table of contents, see http://world.std.com/~athenasc/netsbook.html

Reinforcement Learning: An Introduction by Richard S. Sutton, AT&T Shannon Laboratories, and Andrew G. Barto, University of Massachusetts. MIT Press, 1998, 344 pages, ISBN 0-262-19398-1.

Reinforcement learning, one of the most active research areas in artificial intelligence, is a computational approach to learning whereby an agent tries to maximize the total amount of reward it receives when interacting with a complex, uncertain environment. This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning. The discussion ranges from the history of the field's intellectual foundations to the most recent developments and applications. The only necessary mathematical background is familiarity with elementary concepts of probability.

The book is divided into three parts. Part I defines the reinforcement learning problem in terms of Markov decision processes. Part II provides basic solution methods: dynamic programming, Monte Carlo methods, and temporal-difference learning. Part III presents a unified view of the solution methods and incorporates artificial neural networks, eligibility traces, and planning. The final chapter presents a number of case studies.
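For a flavor of the material in Part II, here is a minimal sketch of the standard tabular one-step temporal-difference (TD(0)) value update; the toy example and variable names are illustrative, not taken from the book's text.

    # TD(0): V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s)), learned from
    # sampled transitions (s, r, s') with no model of the environment.
    def td0_update(V, s, r, s_next, alpha=0.1, gamma=0.9):
        V[s] = V.get(s, 0.0) + alpha * (r + gamma * V.get(s_next, 0.0) - V.get(s, 0.0))

    # Toy two-state chain: state 0 -> state 1 (reward 0), state 1 -> terminal (reward 1).
    V = {}
    for _ in range(1000):
        td0_update(V, 0, 0.0, 1)
        td0_update(V, 1, 1.0, "terminal")
    print(V)   # roughly V[1] = 1.0 and V[0] = gamma * V[1] = 0.9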

Combinatorial Optimization by William J. Cook, Rice University, William H. Cunningham, University of Waterloo, William R. Pulleyblank, IBM Research, and Alexander Schrijver, Centrum voor Wiskunde en Informatica. John Wiley and Sons, Inc., 1998, 368 pages, ISBN 0-471-55894-X.


One of the youngest, most vital areas of applied mathematics, combinatorial optimization integrates techniques from combinatorics, linear programming, and the theory of algorithms. Because of its success in solving difficult problems in areas from telecommunications to VLSI, from product distribution to airline crew scheduling, the field has seen a groundswell of activity in the past decade.

Combinatorial Optimization is an ideal introduction to this mathematical discipline for advanced undergraduates and graduate students of discrete mathematics, computer science, and operations research. Written by a team of recognized experts, the text offers a thorough, highly accessible treatment of both classical concepts and recent results. The topics include network flow problems, optimal matching, integrality of polyhedra, matroids, and NP-completeness. The book features logical and consistent exposition, clear explanation of basic and advanced concepts, many real-world examples, and helpful, skill-building exercises.

Advances in Sensitivity Analysis and Parametric Programming edited by Tomas Gal, FernUniversität, and Harvey J. Greenberg, University of Colorado at Denver. Kluwer Academic Press, 1997, 610 pages, ISBN 0-7923-9917-X.

Since the beginning of linear programming a half century ago, sensitivity analysis has been an integral part of its theory, implementations, and applications. As much as possible, these foundations have, over the years, been extended to nonlinear, integer, stochastic, multi-criteria, and other mathematical programming, though those extensions have so far not provided as rich a body of knowledge. Recent advances in mathematical programming have opened up new insights about sensitivity analysis (including sensitivity analysis in the presence of degeneracy in linear programming). The paradigmatic "What if...?" question is no longer the only question of interest. Often, we want to know "Why...?" and "Why not...?" Such questions were not analyzed in the early years of mathematical programming to the same extent that they are now, and we have expanded our thinking not only about "post-optimal analysis" but also about "solution analysis," even if the solution obtained is not optimal. Therefore, it is now time to examine all the recent advances in sensitivity analysis and parametric programming.

This book bridges the origins of sensitivity analysis with the state of the art. It covers much of the traditional approaches with a modern perspective (including degeneracy graphs and the tolerance approach). It shows recent results using the optimal partition approach, stemming from interior point methods, for both linear and quadratic programming. It examines the special case of network models and presents a neglected topic, qualitative sensitivity analysis. The book also presents advances in sensitivity analysis outside of standard linear programming. These include mixed integer programming, nonlinear programming, multi-criteria mathematical programming, stochastic programming, quadratic programs, and fuzzy mathematical programming.

INFORMS ON-LINE
http://www.informs.org/

Subscribe to IOL-NEWS and get weekly e-mail updates. Visit INFORMS ON-LINE for details.


Journal on Computing: Vol. 10, No. 4

Feature Article
"The World Wide Web: Opportunities for Operations Research and Management Science", Bhargava and Krishnan

Commentaries
"Dynamic, Distributed, Platform Independent OR/MS Applications -- A Network Perspective", Bradley and Buss
"Predictions for Web Technologies in Optimization", Fourer
"A New Horizon for OR/MS", Geoffrion
"The World Wide Web: It's the Customers", Trick

Rejoinder
"OR/MS, Electronic Commerce, and the Virtual INFORMS Community", Bhargava and Krishnan

Contributed Research Articles
"Partitioning the Attribute Set for a Probabilistic Reasoning System", Sarkar and Ghosh
"An Efficient Algorithm for Solving an Air Traffic Management Model of the National Airspace System", Boyd, Burlingame and Lindsay
"Lifted Cover Inequalities for 0-1 Integer Programs: Computation", Gu, Nemhauser and Savelsbergh
"A Branch and Bound Algorithm for the Stability Number of a Sparse Graph", Sewell
"Distribution Estimation using Laplace Transforms", Harris and Marchal

Feature Article Abstract: The World Wide Web has already affected OR/MS work in a significant way, and holds great potential for changing the nature of OR/MS products and the OR/MS software economy. Web technologies are relevant to OR/MS work in two ways. First, the Web is a multimedia communication system. Originally based on an information pull model, it is -- critical for OR/MS -- being extended for information push as well. Second, it is a large distributed computing environment in which OR/MS products -- interactive computational applications -- can be made available, and interacted with, over a global network. Enabling technologies for Web-based execution of OR/MS applications are classified into those involving client-side execution and server-side execution. Methods for combining multiple client-side and server-side technologies are critical to OR/MS's use of these technologies. These methods, and various emerging technologies for developing computational applications, give the OR/MS worker a rich armament for building Web-based versions of conventional applications. They also enable a new class of distributed applications working on real-time data. Web technologies are expected to encourage the development of OR/MS products as specialized component applications that can be bundled to solve real-world problems. Effective exploitation, for OR/MS purposes, of these technological innovations will also require initiatives, changes, and greater involvement by OR/MS organizations.

Fred Glover Wins von Neumann Prize

Fred Glover was awarded the INFORMS John von Neumann Theory Prize at the Spring 1998 meeting in Montreal.

The presentation was made by Leon S. Lasdon, chair of the INFORMS von Neumann Committee (see photo at left). The award was presented to Dr. Glover for "his fundamental contributions to integer programming, networks, and combinatorial optimization." The full text of the presentation citation can be found at http://www.informs.org/Prizes/vonneumanndetails.htm#1998.

Fred's accomplishments are also nicely outlined in the June 1998 issue of OR/MS Today.

The von Neumann Prize is awarded annually to a scholar who has made fundamental, sustained contributions to theory in operations research and the management sciences. It is awarded for a body of work, typically published over a period of several years, reflecting contributions that have stood the test of time. The criteria for the prize are broad, and include significance, innovation, depth, and scientific excellence. The award is $5,000, a medallion, and a citation.

Fred Glover (left) and Leon Lasdon


Computing and Optimization Tools for the New Millennium

Seventh INFORMS Computing Society Conference
Cancún, México

January 5-7, 2000

Computer science and operations research share an important part of their history. Their interface is responsible for advances that could not have been achieved in isolation. The first six INFORMS Computing Society (ICS) conferences witnessed fascinating developments at the computer science/operations research interface. We would like to take this opportunity to invite you to the seventh ICS conference, which has the goal of bringing together researchers and practitioners in Operations Research, Computer Science, Management Science, Artificial Intelligence, and other related fields. These researchers and practitioners will be responsible for creating the computing and optimization tools for the new millennium.

The advisory committee for the conference is:
· Bruce Golden (University of Maryland)
· Harvey Greenberg (University of Colorado)
· Paolo Toth (University of Bologna)
· John Hooker (Carnegie Mellon University)

The conference hotel will be the Westin Regina Cancún, situated just up the beach from the Punta Nizuc natural reef. Anyone interested in organizing sessions or tutorials should contact either one of us. We hope to continue the tradition of excellence established in previous ICS conferences, for which we count on your support and participation. For additional information, please visit the conference web site at http://www-bus.colorado.edu/Faculty/Laguna/cancun2000.html

General Co-Chairs:
Manuel Laguna
University of Colorado
[email protected]

José Luis González Velarde
Monterrey Tech
[email protected]


Upcoming Meetings

INFORMS Spring 1999 Meeting, Cincinnati Convention Center and Omni Netherland Plaza, Cincinnati OH, May 2-5, 1999. Theme: Delivering to the Global Consumer. Features comprehensive coverage of OR/MS topics while also striving for increased emphasis on pedagogy and practice. General Chair: David Rogers ([email protected]). URL=http://www.informs.org/Conf/Cincinnati99/

Fourth Conference on Information Systems and Technology, Cincinnati OH, May 2-5, 1999, sponsored by the INFORMS Colleges on Information Systems and Artificial Intelligence. URL=http://www.cob.ohio-state.edu/~rolland/cist-99/main.htm

Winter Simulation Conference (WSC '98), Grand Hyatt Hotel, Washington DC, December 13-16. Theme: "Simulation in the 21st Century". URL=http://www.wintersim.org

19th IFIP TC7 Conference on System Modelling and Optimization, Cambridge, England, July 12-16, 1999.

Society for Medical Decision Making (SMDM), Reno, NV, October 2-5, 1999.

INFORMS Fall 1999 Meeting, Philadelphia Marriott, Philadelphia PA, November 7-10, 1999.

INFORMS/CSTS 7th Biennial Conference, Cancun, Mexico, January 2000. General Chair: Manuel Laguna.

INFORMS Spring 2000 Meeting, San Francisco CA.

INFORMS/KORS, Seoul, South Korea, Summer 2000.

News about Members

Colin Bell ([email protected]) left his position as Professor of Management Sciences and Associate Dean at the University of Iowa for a position in the Microsoft Project Development group at Microsoft Corporation.

Ismail Chabini ([email protected]) is the winner of an NSF CAREER award starting in June 1998.

Shyam ([email protected]) and Veena Chadha at the University of Wisconsin, Eau Claire, together with several students, investigated the distribution problem of the Walmart distribution center located at Menomonie, WI, in the spring of 1998. A mathematical model was developed using ten cities of the region. This project was funded by the Student Research Collaboration Grants of the University of Wisconsin, Eau Claire.

Jonathon Eckstein ([email protected]) has been granted tenure and promoted to Associate Professor at Rutgers University. In addition, he was awarded the Rutgers University Board of Trustees Fellowship for Scholarly Excellence in May 1998, and his second child was born in March 1998.

Anna Nagurney ([email protected]) has been named to an endowed chair position at the University of Massachusetts. Her title is John F. Smith Memorial Professor. This professorship is funded by a gift from Jack Smith, CEO of General Motors, to honor the memory of his father.

Marcel F. Neuts ([email protected]) is developing a research program in computer experimentation for probability models. He is also starting a news bulletin in this area, in addition to the existing Stochastic Models News and the Matrix-Analytic Bulletin. For the year 2000, he is planning a special issue of Stochastic Models on the methodology of computer experimentation. He also offers consulting and educational services in applied probability and its computational aspects.

S. Raghavan ([email protected]) moved from his position as Acting Director, Optimization Group, at U S WEST Advanced Technologies, Boulder CO, to take up a faculty appointment in the Decision and Information Technologies Department, within the Robert H. Smith School of Business at the University of Maryland, College Park MD.


The GAMS Short Course: Optimization Modeling and Problem-Solving

Using the General Algebraic Modeling System

Date: January 11-14, 1999
Location: Carmel, California
Instructor: Dr. Richard E. Rosenthal

Here's what others have said about Dr. Rosenthal's course:

"Dr. Rosenthal does a great job of bridging the gap between academia and the corporate world. His instruction is excellent, and the real-world optimization models he shares with the class are invaluable." - Margery Connor, Chevron

"Dr. Rosenthal has many years of experience with a remarkable and broad range of real world applications. His course is loaded with valuable examples, anecdotes, and insights." - Dr. Michael Saunders, Stanford University

"Truly one of the most enjoyable and instructive courses I have ever attended. Fine balance between optimization theory, GAMS implementation and managerial practice." - Howard Mason, Bankers Trust

"Excellent course! I wish I went to it a year ago. I could have saved a ton of time and money." - Allan Metts, BellSouth

For detailed course description and registration information, contact GAMS Development Corp. (tel: 202-342-0180, fax: 202-342-0181) or view web site http://www.gams.com

Now in its eleventh year