
High accuracy computation with linear analog optical systems: a critical study

Demetri Psaltis and Ravindra A. Athale

High accuracy optical processors based on the algorithm of digital multiplication by analog convolution (DMAC) are studied for ultimate performance limitations. Variations of optical processors that perform high accuracy vector-vector inner products are studied in abstract and with specific examples. It is concluded that the use of linear analog optical processors in performing digital computations with DMAC leads to impractical requirements for the accuracy of analog optical systems and the complexity of postprocessing electronics.

I. Introduction

Analog optical processors can be used in a variety of ways to implement linear transformations and filtering useful in signal processing problems with very high throughput requirements. Among the unique features of optics exploited in such processors are (1) 1-D or 2-D parallelism, (2) ease of performing complex multiplication and addition, and (3) global and arbitrary communication between the parallel channels of computation. The most notable successes of analog optical processing have been synthetic aperture radar processing,1 acoustooptic processors for rf spectrum analysis,2 and correlations.3 One of the most important performance parameters of these systems is the linear dynamic range, defined as the ratio of the highest allowable input signal level (where nonlinear distortions appear) to the lowest input signal level that produces an output signal (e.g., the correlation peak or the spectrum) equal to the random noise at the output due to detector dark current, scattered light, etc. The use of heterodyning techniques in output detection has led to optical systems with 70 dB of linear dynamic range in rf signal power.4

A traditional approach to improving computational accuracy of any system is to represent measurable quantities in a digital number system in which a single large number is represented by an ordered n-tuple of small numbers [$a \rightarrow (a_1, a_2, \ldots, a_n)$]. The encoding and decoding can be a complicated process determined by the particular number system used. The most popular digital representations are the binary and decimal number systems, even though other systems such as residue arithmetic have been used. Since in a digital number system several small numbers are used to represent a large number, the dynamic range limitations of the analog system can be overcome, and it becomes possible to represent large numbers accurately. The operations of multiplication and addition in the digital number system also break down into several small size calculations. The individual subcalculations, however, involve nonmonotonic nonlinearities and, in the cases of binary or decimal systems, involve interaction between adjacent digits in the form of carry propagation. Thus, in terms of optical processing, the use of digital number representation involves more than a simple trade-off between accuracy and parallelism (and, therefore, speed). It introduces nonmonotonic nonlinear operations, which are not easily implemented with optics.

Demetri Psaltis is with California Institute of Technology, Pasadena, California 91125, and R. A. Athale is with BDM Corporation, 7915 Jones Branch Road Drive, McLean, Virginia 22102.

Received 11 March 1986. 0003-6935/86/183071-07$02.00/0. © 1986 Optical Society of America.

During the past decade several research projects have been undertaken to develop nonlinear optical components and systems to perform computation. Optical logic is a growing field in which devices with improved performance are being developed continuously. The application of these logic devices to numerical computation in the binary number system, however, has seen only limited development. The residue number system has also been investigated for optical realization using integrated optical switches, diffraction gratings, and liquid crystal light valves.5-8 More recent developments have used holographic table look-up processors with binary as well as residue number systems9 and have proposed use of threshold logic gates in implementing the truth tables for binary multiplication and addition.10

15 September 1986 / Vol. 25, No. 18 / APPLIED OPTICS 3071

Whether numerical optical processors that utilize nonlinear optical devices will be practically useful in the near future is a very important question, which is not addressed in this paper. Instead we will investigate another approach in which linear analog optical systems are used to perform partial calculations and produce an intermediate result that can be converted to standard digital representation with appropriate nonmonotonic nonlinear operations. This approach has a long history, and the first reference to these techniques is found in the ancient Indian scriptures of the Vedas (2000 years old) as a shortcut to large number multiplications.11 In more recent times, Schwartzlander suggested it to the electronics community as a "quasi serial multiplier,"12 and Whitehouse and Speiser13 introduced it to the optical signal processing community under the name of "digital multiplication by analog convolution" or DMAC for short. The first optical implementation was carried out by Psaltis et al.,14 and schemes to incorporate it in optical linear algebra processors were put forward by Guilfoyle15 and Collins et al.16 In the last four years numerous modifications to the basic idea have been suggested with different predicted performances.17-22 In this paper we will examine in detail the trade-offs involved in performing the digital calculations by linear analog optical systems. Section II contains a general trade-off analysis for a generic system. Section III contains several specific examples of high accuracy optical processors with a common technology substrate and a common throughput performance to facilitate the extraction of basic limitations that cannot be changed by architectural ingenuity. Section IV details the conclusions of this study and makes some recommendations.

II. General Trade-Offs

We will consider the trade-offs associated with a generic system designed to perform linear operations on binary formatted data. The binary encoding of data has three immediate consequences. The first is an increase in the number of elementary operations that need to be performed compared to an analog implementation. Each sample is represented by several bits, and we normally need to operate on each bit several times to perform the desired calculation. Thus, given the same system resources (space-bandwidth product, temporal bandwidth, and dynamic range), we end up performing fewer multiplications and additions per second with the binary encoded data than with the analog encoding. This loss in processing speed is generally acceptable if it can be traded off for improved accuracy. The work in applying the digital multiplication by analog convolution (DMAC) algorithm to optical linear algebra processors was motivated by exactly this hope.

Unfortunately, the decrease in computational throughput is not the only consequence of the binary encoding of data. Even though the requirement of performing nonmonotonic nonlinear operations and propagating carries can be postponed by using the DMAC algorithm, it cannot be avoided entirely. This task falls to the electronic postprocessor that has to handle the signals coming out of the optical detector array and convert them into a standard binary number. A trade-off analysis must incorporate the performance requirements and complexity of the postprocessing electronics for proper evaluation of the possible comparative advantages of the optical processor.

A third factor to be considered is the performance required of the linear analog optical processor. Although the input data are encoded in binary, the absence of quantizing and standardizing procedures within the optical system implies that the optical signals at the output will have multiple levels. The number of levels that need to be resolved with low probability of error will be governed by the number of bits used in the binary representation and the number of parallel multiplications carried out by the optical processor. Since these two quantities also determine the throughput and accuracy of the processor, the dynamic range and accuracy of the analog optical processor play a crucial role in determining the ultimate performance.

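A. Description of a Generic System

We will first examine the DMAC algorithm in detail. Let f and g be two integers and $f_i$ and $g_i$ be the binary strings that represent the two numbers in a binary number system. Therefore,

$$ f = \sum_{i=0}^{L-1} f_i 2^i ; \qquad g = \sum_{i=0}^{L-1} g_i 2^i . \tag{1} $$

If the product of the two numbers is denoted by h,

$$ h = f \cdot g = \sum_{i=0}^{2L-2} 2^i \sum_{j=0}^{L-1} f_j g_{i-j} , \tag{2} $$

where $g_{i-j}$ is taken to be zero whenever $i-j$ falls outside the range $0, \ldots, L-1$.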

This simple relationship provides us with the DMAC algorithm for multiplying two binary numbers using a linear optical system. The two binary strings are convolved with each other, and the final answer is obtained by weighting each term of the convolution with an exponential factor ($2^i$) and summing over all terms. We normally perform the convolution using an analog optical system to obtain the coefficients of expansion and skip direct evaluation of the summation over i in Eq. (2) to avoid the explosion in dynamic range that results from the exponential weights. We are thus evaluating the nonbinary string $h_i$, which represents the integer h:

$$ h_i = \sum_{j=0}^{L-1} f_j g_{i-j} . \tag{3} $$

The analog string can be converted into a standard binary string by using an analog-to-digital (A-D) converter and then performing binary addition with appropriate shifts of the binary strings corresponding to $h_i$. The block diagram of the system performing this operation is shown in Fig. 1.
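As an illustration only, the steps of Eqs. (1)-(3) can be sketched in software. The Python sketch below is not a model of any optical hardware; the function names (`to_bits`, `convolve`, `shift_and_add`, `dmac_multiply`) are hypothetical, and the list `h` simply plays the role of the analog string that the A-D converter would digitize.

```python
def to_bits(value, length):
    """Return the `length` least significant bits of value, LSB first (Eq. 1)."""
    return [(value >> i) & 1 for i in range(length)]


def convolve(f_bits, g_bits):
    """Linear convolution h_i = sum_j f_j * g_(i-j) of two bit strings (Eq. 3)."""
    h = [0] * (len(f_bits) + len(g_bits) - 1)
    for j, fj in enumerate(f_bits):
        for k, gk in enumerate(g_bits):
            h[j + k] += fj * gk
    return h


def shift_and_add(h):
    """Weight term i by 2**i and sum, i.e. the outer sum of Eq. (2)."""
    return sum(hi << i for i, hi in enumerate(h))


def dmac_multiply(f, g, length):
    """Multiply two nonnegative integers that fit in `length` bits."""
    h = convolve(to_bits(f, length), to_bits(g, length))
    return h, shift_and_add(h)


h, product = dmac_multiply(13, 11, 4)
print(h)        # [1, 1, 1, 3, 1, 1, 1]
print(product)  # 143
```

For f = 13 and g = 11 with L = 4, the intermediate string contains the value 3, which is not a binary digit. This is the kind of multilevel signal that the detector and A-D converter must resolve before the shift-and-add step restores the binary result.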

Fig. 1. Block diagram of the DMAC multiplier. (Binary inputs f and g enter an optical convolver; its analog outputs pass through an A/D converter and a shift-and-add circuit to give the binary output h.)

This idea can be generalized to allow implementation of any linear transformation where the basic arithmetic operation involved is a sum of products. The sum of two binary encoded numbers can be obtained by simply adding without carries the equivalent bits pairwise with an analog processor and then performing the operation of A-D conversion and binary addition with shifts that we described in connection with the multiplication. Now we can merge these techniques of digital multiplication and digital addition by linear analog optical systems combined with a digital electronic postprocessor and perform the canonical operation of an inner product between two N-element column vectors f and g. The elements of the vectors are represented by binary strings. Therefore, $f_{i,k}$ and $g_{i,k}$ correspond to the ith bits of the kth elements of vectors f and g, respectively. The inner product between f and g is a scalar h given by

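$$ h = \mathbf{f}^T \mathbf{g} = \sum_{k=1}^{N} f_k g_k = \sum_{k=1}^{N} \sum_{i=0}^{2L-2} 2^i \sum_{j=0}^{L-1} f_{j,k}\, g_{(i-j),k} . \tag{4} $$

This equation can be rewritten by rearranging the order in which the three summations are performed:

$$ h = \sum_{i=0}^{2L-2} 2^i \left[ \sum_{k=1}^{N} \sum_{j=0}^{L-1} f_{j,k}\, g_{(i-j),k} \right] . \tag{5} $$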

The term in the bracket is a 2-D linear operation that is implemented with an analog optical processor, and the result is then digitized using the sequence of operations described earlier to produce the binary string representing h while avoiding the explicit evaluation of the exponential weights. Again the decomposition suggested here is not unique. One could choose to add fewer bits optically by digitizing after only one of the summations or partial sums is performed. As we will see shortly, the number of bits that are accumulated to form each output sample is a crucial parameter that directly affects the accuracy of the processor.
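A minimal Python sketch of the rearranged sum in Eq. (5), assuming an idealized, noise-free accumulation. The names `mixed_binary_inner_product` and `shift_and_add` are hypothetical, and the values N = 32 and L = 16 are borrowed from Table I. With every input bit set to one, the sketch shows how large a single accumulated sample can become.

```python
def to_bits(value, length):
    """Return the `length` least significant bits of value, LSB first."""
    return [(value >> i) & 1 for i in range(length)]


def mixed_binary_inner_product(f_vec, g_vec, length):
    """Bracketed term of Eq. (5): accumulate bit products over k and j."""
    h = [0] * (2 * length - 1)
    for f, g in zip(f_vec, g_vec):
        f_bits, g_bits = to_bits(f, length), to_bits(g, length)
        for j, fj in enumerate(f_bits):
            for m, gm in enumerate(g_bits):
                h[j + m] += fj * gm  # contributes to output term i = j + m
    return h


def shift_and_add(h):
    """Outer sum of Eq. (5): weight term i by 2**i and add."""
    return sum(hi << i for i, hi in enumerate(h))


N, L = 32, 16                      # values taken from Table I
f_vec = [2 ** L - 1] * N           # worst case: every bit equal to one
g_vec = [2 ** L - 1] * N
h = mixed_binary_inner_product(f_vec, g_vec, L)

print(max(h))                      # 512, reached at the central term i = L - 1
print(shift_and_add(h) == sum(f * g for f, g in zip(f_vec, g_vec)))  # True
```

The largest accumulated value is N·L = 512, which is the number of levels quoted for the fully optical accumulations in Sec. III. Digitizing after fewer optical additions lowers this peak at the cost of more electronic work.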

Fig. 2. Schematic diagram of a generalized optical processor. (Binary signals enter the optical system through $N_1$ input channels of bandwidth $B_1$ and leave through $N_2$ output channels of bandwidth $B_2$.)

A schematic diagram of a generalized optical processor that performs linear operations using the DMAC algorithm is shown in Fig. 2. The system has $N_1$ parallel spatial channels at the input, and each channel accepts binary bits at a rate $B_1$. Each channel may accept bits every clock cycle from a separate external information source or from the adjacent channel. The information in each channel is multiplied by M separate bits in the optical processor. We call M the fan-out factor since it is generally equal to the number of output channels that are illuminated by light emitted from each input channel. The system has $N_2$ parallel output channels each having temporal bandwidth $B_2$. The binary products that are optically formed are accumulated (added) at the output plane through either spatial or temporal integration to form a linear transformation of the binary input data. If $N_1 M > N_2$, the system performs spatial integration since multiple bit products are detected at the same spatial location. If $B_1 > B_2$, the system is time integrating. Both conditions can hold simultaneously, in which case the linear transformation is performed through a combination of temporal and spatial integrations. The signal detected at each output channel is electronically converted to the binary representation by the A-D converter and a shift-and-add circuit.
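The two conditions above can be written as a small classifier. This is a sketch only: the function name `integration_modes` and the numerical arguments in the usage lines are hypothetical and are not taken from any of the architectures discussed later.

```python
def integration_modes(n1, m, n2, b1, b2):
    """Return which kinds of integration a generic processor performs.

    n1, b1: number and bit rate of the input channels
    m:      fan-out factor
    n2, b2: number and bandwidth of the output channels
    """
    modes = []
    if n1 * m > n2:      # several bit products land on the same detector
        modes.append("space")
    if b1 > b2:          # several input bits arrive per output sample
        modes.append("time")
    return modes


# Hypothetical parameter sets, chosen only to exercise each branch.
print(integration_modes(32, 16, 1, 50e6, 50e6))    # ['space']
print(integration_modes(32, 1, 32, 50e6, 1.6e6))   # ['time']
print(integration_modes(32, 16, 31, 50e6, 1.6e6))  # ['space', 'time']
```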

Having defined these parameters we can characterize the performance of any specific architecture without further knowledge of the details of the implementation. Thus we will be able to derive some guidelines that are generally applicable. The number of bit multiplications that the processor performs per unit time is equal to $N_1 B_1 M$. The number of bit multiplications that are required to realize one multiplication between two integers with DMAC is $aL^2$, where a is a constant between 1 and 4, depending on the efficiency of the specific implementation, and L is the number of bits that are used to represent each number. The processing power P of the overall system is

$$ P = \frac{N_1 B_1 M}{a L^2} \ \text{multiplications/s}. \tag{3} $$

Clearly, P is a number we wish to maximize. Also note that Eq. (3) demonstrates the trade-off between accuracy and processing speed referred to earlier. The number of analog-to-digital conversions that need to be performed per unit time is

$$ C = N_2 B_2 \ \text{A-D conversions/s}. \tag{4} $$

Again it is clear that C is a number we wish to minimize to keep the complexity of the electronics at a minimum. Unfortunately, P and C cannot be independently chosen. To appreciate this fact, we define the ratio

$$ R = \frac{P}{C} = \frac{N_1 B_1 M}{N_2 B_2\, a L^2} \ \text{multiplications/A-D conversion}. $$

Table I. Design Parameters Common to All Architectures

| Parameter | Value |
|---|---|
| Vector/matrix dimensions | N = 32 |
| Input accuracy | M = 16 bits |
| Input data rate (per element) | B = 50 Mbits/s |
| Overall throughput | C = 50 MOPS |
| Output accuracy | 21-22 bits |

The ratio R increases monotonically as either P increases or C decreases. Therefore, we want to make R as large as possible. R provides a direct comparative estimate of the optical implementation vs an all-electronic implementation. If, for example, R were equal to one, only one binary multiplication would be performed by the optical system per A-D conversion. Since it is about equally difficult to perform multiplications and A-D conversions electronically, this would be a strong indication that optics offers no advantage in this case. To determine the maximum value for R we consider the characteristics of the output stage of the processor. The number of bits that are being generated per unit time within the optical processor is equal to $N_1 B_1 M$. The number of samples that are being transferred out of the optical processor per unit time is $N_2 B_2$ (= C). The ratio $(N_1 B_1 M)/(N_2 B_2)$ is, therefore, equal to the maximum number of bits that are accumulated (through either spatial or temporal integration) to form each output sample. Therefore, this ratio cannot exceed the accuracy of the system $DR_2$, which is defined as the number of distinguishable signal levels that can be produced reliably by the detector and A-D converter. This gives us a very simple upper bound for R:

$$ R \le \frac{DR_2}{a L^2} . \tag{6} $$

For example, if L = 10 and a = 1, we require the number of distinguishable levels at the output to be at least 100 for R to be equal to 1. It is important to note that the output levels must be sufficiently well defined so that the A-D converter can detect all of them with very low probability of error. $DR_2$ is much smaller than what is conventionally called the detector dynamic range. We estimate that it would require very sophisticated engineering to obtain $DR_2$ = 100, and it does not appear that $DR_2$ = 1000 is practically feasible in the foreseeable future.
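The definitions of P, C, and R and the bound of Eq. (6) are simple enough to evaluate directly. The sketch below uses hypothetical function names; the first two calls use the values quoted in the text, while the last parameter set is an assumed example rather than a description of a specific processor.

```python
def processing_power(n1, b1, m, a, num_bits):
    """P = N1 * B1 * M / (a * L**2), in integer multiplications per second."""
    return n1 * b1 * m / (a * num_bits ** 2)


def conversion_rate(n2, b2):
    """C = N2 * B2, in A-D conversions per second."""
    return n2 * b2


def ratio_upper_bound(dr2, a, num_bits):
    """Upper bound on R = P / C set by the resolvable output levels DR2."""
    return dr2 / (a * num_bits ** 2)


# Values quoted in the text: L = 10, a = 1, DR2 = 100 gives R of at most 1.
print(ratio_upper_bound(100, 1, 10))    # 1.0
# Even DR2 = 1000 with 16-bit words allows fewer than 4 multiplications
# per A-D conversion.
print(ratio_upper_bound(1000, 1, 16))   # 3.90625

# Assumed example: 32 input channels at 50 Mbit/s, fan-out 16, one output
# channel at 50 MHz, a = 1, L = 16.
p = processing_power(n1=32, b1=50e6, m=16, a=1, num_bits=16)
c = conversion_rate(n2=1, b2=50e6)
bits_per_sample = (32 * 50e6 * 16) / c
print(p / c, bits_per_sample)
```

In the assumed example each output sample accumulates 512 bit products, so the design is only consistent with Eq. (6) if the detector and A-D converter can resolve at least that many levels.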

The pessimistic result from the above discussion regarding the viability of DMAC-based optical algebraic processors is based on the assertion that an analog-to-digital conversion is approximately as costly as an electronically implemented binary multiplication. We can accept this without further qualification if the two operations are performed at the same speed and with the same number of bits. Through some specific examples we will consider in the following section, we will see that this is not always the case. It is appropriate then to ask whether we can derive a possible advantage by replacing fast digital multipliers with a larger number of slower ADCs that perhaps also have a smaller word length. The answer to this question is a qualified no. The reason is that we can generally find a digital implementation that also uses more but slower multipliers that can solve the same problem at the same overall processing rate. The qualification is that in problems where the input binary data are available at a very high data rate, a time-integrating optical processor can accept the input and deliver it at slower rates to the ADCs. In this rather specialized situation optics may help avoid the need for high speed multipliers. In general, however, there is not a clear advantage in DMAC-based optical processors over electronic systolic arrays, and thus we do not expect these systems to have a broad significant impact in signal processing.

Fig. 3. Schematic diagrams of optical systems for performing a vector-vector inner product (a) and scalar-vector product (b). A vector-matrix multiplication is performed by repeatedly performing either operation. (Panel (a): 1-D spatial light modulators (SLMs) and a point detector. Panel (b): a point modulator, a 1-D SLM, and a time-integrating detector array.)

III. Specific Examples

In this section we elaborate on the conclusions of the previous section through specific examples. The particular linear operation chosen is a vector-matrix multiplication, since a large number of more complex operations can be decomposed in terms of this operation and also since it is useful by itself. Acoustooptic Bragg cells are selected as the active optical devices, since they represent the most mature technology and since most of the previous work in this area was also based on this technology. The architectures to be considered here handle all the elements of an input vector in parallel and hence can be thought of as 1-D processors. However, since each element is represented by several bits, the optical system has to use the second spatial dimension as well, thus physically giving a system that is 2-D in nature. Table I shows the performance parameters common to all the architectures to be analyzed here. These values were chosen in view of the current state of the art in device technology. The same computational throughput will be shown to require different levels of system performance for different architectures.

The operation of vector-matrix multiplication can be implemented on a 1-D processor using two different strategies: (1) space-integrating architecture based on a vector-vector inner product; (2) time-integrating architecture based on a scalar-vector product. The schematic diagrams of these two systems are shown in Fig. 3. In the space-integrating architecture, the vector g and rows of the matrix F [$f^{(i)}$ is the ith row of F] are input in parallel, while the output vector h is calculated sequentially one element at a time. In the time-integrating architecture, on the other hand, the vector g is entered sequentially, and columns of F [$f^{(j)}$ is the jth column of F] are input in parallel, while the output vector h is accumulated in parallel in the time-integrating detector array.

Fig. 4. Two ways of performing digital multiplication via linear analog optical systems: (a) space-integrating convolver (A-O deflectors and a point detector); (b) time-integrating convolver (A-O deflectors and a time-integrating detector array).

Fig. 5. Schematic diagram of a system employing a vector-vector inner product with space-integrating convolution.

Fig. 6. Schematic diagram of a system employing a scalar-vector product with time-integrating convolution.

The convolution operation needed to implement the digital multiplication can similarly be performed using space integration or time integration. The schematic diagrams of these two systems are shown in Fig. 4. It should be noted that one of the binary sequences has to be reversed with respect to the other in the time-integrating convolver.
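One way to read the reversal remark is sketched below, under the assumption (not spelled out in the text) that the time-integrating detector array accumulates a cross-correlation of the two bit sequences. Reversing one sequence then turns that correlation into the convolution required by Eq. (3). The function names are hypothetical.

```python
def convolve(a, b):
    """Linear convolution out[i] = sum_j a[j] * b[i - j]."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def correlate(a, b):
    """Cross-correlation c[lag] = sum_n a[n + lag] * b[n] for every lag
    at which the sequences overlap, from -(len(b) - 1) to len(a) - 1."""
    out = []
    for lag in range(-(len(b) - 1), len(a)):
        total = 0
        for n, bn in enumerate(b):
            if 0 <= n + lag < len(a):
                total += a[n + lag] * bn
        out.append(total)
    return out


f_bits = [1, 0, 1, 1]   # 13, least significant bit first
g_bits = [1, 1, 0, 1]   # 11, least significant bit first

# Correlating f with the reversed g reproduces the convolution of f and g.
print(correlate(f_bits, g_bits[::-1]) == convolve(f_bits, g_bits))  # True
# Without the reversal the two results differ for this pair.
print(correlate(f_bits, g_bits) == convolve(f_bits, g_bits))        # False
```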

The two choices each for the matrix operation and the digital multiplication can be combined to produce four architectures for high accuracy vector-matrix multiplication. These are

(1) inner product/space-integrating convolution;
(2) scalar-vector product/time-integrating convolution;
(3) inner product/time-integrating convolution;
(4) scalar-vector product/space-integrating convolution.

Figures 5-8 depict these four processors. The multitransducer acoustooptic Bragg cells shown in the figures are assumed to have thirty-two parallel channels (the size of the vector) with a time-bandwidth product of (2L - 1), where L is the number of bits (sixteen in this case). The requirements for the active devices in these four systems are widely different. Although all the active devices need to be considered for a complete system design, we will concentrate on the complexity of the postprocessing electronics and the analog accuracy of the optical processor since we have already established that this part of the processor is the most crucial in determining the practicality of the system. Table II contains a list of the relevant parameters for all the processor architectures. In what follows, we discuss each system briefly and explain the origin of the parameter values obtained in each case.

Fig. 7. Schematic diagram of a system employing a vector-vector inner product with time-integrating convolution.

Fig. 8. Schematic diagram of a system employing a scalar-vector product with space-integrating convolution.

Table II. Parameters for Processor Architectures

| Architecture | I | II | III | IV |
|---|---|---|---|---|
| No. of detectors/A-D | 1 | 1024 | 31 | 32 |
| Bandwidth per channel | 50 MHz | 50 kHz | 1.5 MHz | 50 MHz |
| Accuracy per channel | 9 bits | 9 bits | 9 bits | 4 bits |
| Additional postprocessing | Shift-add (32 bits) | Shift-add (32 bits) | Shift-add (32 bits) | Shift-add + accumulator (32 bits) |

(1) Inner product/space-integrating convolution: In this architecture (Fig. 5) space integration is used for the summation of the vector elements and for the convolution. The output vector h is thus produced in a bit- as well as element-sequential fashion. Since each analog bit of each element of h involves summation over sixteen bits of the input vector element as well as summation of thirty-two elements of the input vector, the single detector and ADC will be required to resolve 512 levels (or nine bits). A new bit is calculated at the output for every clock cycle of the input device, and hence the detector and ADC bandwidth will be 50 MHz.

(2) Scalar-vector product/time-integrating convolution: In this architecture (Fig. 6) time integration is used for the vector-element summation and the convolution. The output vector h is thus produced in a bit- and element-parallel fashion at the end of the integration period. The number of levels to be resolved by the individual detector element and the ADC associated with it still remains at 512, since that is totally determined by the size of the problem and is independent of the method (space or time integration) used to produce the final answer. Since all the bits of all the elements of the vector h are accumulated in parallel, a 32 × 31 2-D time-integrating detector array is needed in the output with an ADC associated with each detector element. The bandwidth per channel is now reduced to approximately 50 kHz.

(3) Inner product/time-integrating convolution: In this architecture, space integration is used for the vector-index summation, and time integration is used for the convolution. The output vector h is thus produced in a bit-parallel and element-sequential fashion. Each element of h is fully calculated (all bits accumulated in parallel) after all the bits of a row of the matrix are input to the processor, i.e., after thirty-one clock cycles of the input devices. The 1-D time-integrating detector array and associated ADC array containing thirty-one elements will now have a bandwidth of approximately 1.6 MHz. The number of levels to be resolved still remains at 512.

(4) Scalar-vector product/space-integrating convolution: In this architecture, time integration is used for the vector-index summation and space integration for convolution. The output vector h is thus produced in a bit-sequential and element-parallel fashion. The optical system is not utilized in performing the time integration for the vector-index summation. That task is delegated to thirty-two digital accumulators (one for each element of the output vector h). Since the output of one channel of the acoustooptic space-integrating convolver corresponds to an analog bit stream representing one element of the scalar-vector product [$g_i f^{(i)}$], it will have only sixteen resolvable levels (4 bits). Hence the 1-D high bandwidth detector array and associated A-D array will need to resolve only sixteen levels. Since a new sample in the output is produced for each new bit in the input, the bandwidth per channel will be 50 MHz. A shift-and-add circuit is needed behind each A-D converter to produce the true binary representation for each element of the scalar-vector product. The binary words resulting from successive scalar-vector products will then be added in an accumulator array, which will have to be 32 bits wide and operate at approximately 1.6 MHz.
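The parameter values discussed in items (1)-(4) follow from three inputs: the vector size, the word length, and the bit rate. The sketch below recomputes them from the reasoning given in the text. The dictionary layout and names are hypothetical, and the sketch follows the text's 32 × 31 array and bandwidths of roughly 50 kHz and 1.6 MHz, whereas Table II lists 1024 detectors and 1.5 MHz.

```python
import math

N = 32             # vector length (Table I)
L = 16             # bits per element (Table I)
BIT_RATE = 50e6    # input bit rate per channel in Hz (Table I)
TERMS = 2 * L - 1  # output terms of one L-bit by L-bit convolution

architectures = {
    # Everything accumulated optically onto one fast detector.
    "I": {"detectors": 1, "bandwidth": BIT_RATE, "levels": N * L},
    # Everything accumulated by time integration on a 2-D array.
    "II": {"detectors": N * TERMS,
           "bandwidth": BIT_RATE / (N * TERMS),
           "levels": N * L},
    # One output element is complete every 2L - 1 clock cycles.
    "III": {"detectors": TERMS,
            "bandwidth": BIT_RATE / TERMS,
            "levels": N * L},
    # Only the convolution is optical; the sum over elements is digital.
    "IV": {"detectors": N, "bandwidth": BIT_RATE, "levels": L},
}

for name, p in architectures.items():
    bits = math.log2(p["levels"])
    print(f"{name:>3}: {p['detectors']:4d} detectors, "
          f"{p['bandwidth'] / 1e6:8.4f} MHz, "
          f"{p['levels']:3d} levels ({bits:.0f} bits)")
```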

These four examples serve to illustrate the different trade-offs among the bandwidth, number of channels, and levels per channel associated with the postprocessing electronics including the detectors. The product of these three parameters is invariant among these four architectures and equal to 2.56 × 10^10 levels/s. This is also equal to the total input data rate for the processor (32-element vector, 16 bits/element, 50-MHz bit rate/channel), which is to be expected since the vector-matrix multiplication is a linear transformation and does not involve any data reducing operations. The first three architectures demonstrate a trade-off between the bandwidth and number of channels while leaving the number of levels produced per channel constant. The fourth architecture reduces the number of levels per channel at the expense of increasing the number of channels and the bandwidth per channel. Since the number of levels to be reliably resolved by a detector is limited by the analog accuracy of the optical processor, this may seem to provide the best solution. A close examination reveals, however, that this trade-off is only achieved by performing the accumulation digitally, thus reducing the computation performed by the optical system. Thus the last architecture, although most practical, will suffer when compared with an all-electronic architecture.
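The invariance of the product can be checked against the tabulated values. The sketch below multiplies the three rows of Table II for each architecture and compares the result with the input data rate; the variable names are hypothetical. Because Table II quotes rounded figures, the products for architectures II and III agree with 2.56 × 10^10 levels/s only to within a few percent.

```python
N, L, BIT_RATE = 32, 16, 50e6
input_rate = N * L * BIT_RATE  # total input data rate in bits per second

# (detectors, bandwidth in Hz, levels per channel) as listed in Table II,
# with 9 bits read as 512 levels and 4 bits as 16 levels.
table_ii = {
    "I": (1, 50e6, 512),
    "II": (1024, 50e3, 512),
    "III": (31, 1.5e6, 512),
    "IV": (32, 50e6, 16),
}

ratios = {}
for name, (detectors, bandwidth, levels) in table_ii.items():
    product = detectors * bandwidth * levels
    ratios[name] = product / input_rate
    print(f"{name:>3}: {product:.3e} levels/s, "
          f"{ratios[name]:.3f} of the input rate")
```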

There are numerous other variations on these architectures for high accuracy optical processors that involve using outer products for digital multiplications or using systolic or engagement architectures for vector-matrix multiplications. They will affect the device requirements and data-flow characteristics of the processor. But they will not change the overall picture concerning the required sophistication for the postdetection electronics.

The conclusion of this study of four specific architectures is that the number of levels per second that the optical processor needs to generate and the electronic postprocessor needs to handle is totally fixed by the computational throughput of the system. Thus the only way of achieving a very high throughput is to have a very high performance electronic postprocessor and a very high accuracy analog optical system. Both of these requirements negate the basic goal of employing digital encoding to build a high-throughput, high-accuracy optical processor that is far superior to an all-electronic implementation. A secondary conclusion is that the only way of reducing the requirements on the analog accuracy of the optical system is to reduce the amount of computation performed by it (multiplication without the summation).

IV. Conclusion

It is quite apparent from the previous discussion that the use of linear analog optical processors in processing digitally encoded data leads to unacceptable requirements on the analog accuracy of the optical system and on the complexity and performance of the electronic postprocessor. As with other situations in life, a difficult task that cannot be avoided altogether becomes progressively more difficult when it is postponed further. The nonmonotonic nonlinear operations are an integral part of processing digitally encoded data. The more operations one performs without this step, the more difficult the nonlinear operations become. On the other hand, if the electronic nonlinear operation is performed frequently, the role of optics diminishes compared to the electronics, putting in question the reason for using optics at all!

Fig. 9. Space-integrating analog processor for performing vector-matrix multiplication. (Labels: multitransducer acoustooptic point modulators and a high-bandwidth detector.)

It is sometimes suggested that the use of a base larger than 2 for the fixed-radix digital number representation will require fewer channels for representing a large number and hence will lead to a more efficient optical processor. This will indeed be the case if we are only interested in minimizing the space-bandwidth requirement of the optical system to perform operations with given accuracy. As we saw in previous sections, however, the number of levels that need to be reliably produced by an optical system will also have to be minimized for a practical system. Therefore, the cost function will involve total levels required to represent a number with a given accuracy and can be chosen to be equal to (number of digits) × (base - 1). The minimum cost for a given accuracy has been shown to occur for base = 3 and is only 5% below the cost for base = 2 (Ref. 25). Therefore, it is apparent that little will be achieved by going to a higher value for the base.
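The dependence of such a cost on the base can be explored numerically. The sketch below assumes the continuous estimate digits = ln(range)/ln(base), which is not stated in the text, and evaluates two candidate weightings: digits × base (every level counted, including zero) and digits × (base - 1) (nonzero levels only). The function names and the 16-bit target range are hypothetical.

```python
import math


def digits_needed(base, dynamic_range):
    """Continuous estimate of the digits needed to span `dynamic_range`."""
    return math.log(dynamic_range) / math.log(base)


def cost_all_levels(base, dynamic_range):
    """digits * base: every level of each digit counted, including zero."""
    return digits_needed(base, dynamic_range) * base


def cost_nonzero_levels(base, dynamic_range):
    """digits * (base - 1): only the nonzero levels of each digit counted."""
    return digits_needed(base, dynamic_range) * (base - 1)


DYNAMIC_RANGE = 2 ** 16  # hypothetical accuracy target
ref_all = cost_all_levels(2, DYNAMIC_RANGE)
ref_nonzero = cost_nonzero_levels(2, DYNAMIC_RANGE)

for base in range(2, 7):
    rel_all = cost_all_levels(base, DYNAMIC_RANGE) / ref_all
    rel_nonzero = cost_nonzero_levels(base, DYNAMIC_RANGE) / ref_nonzero
    print(f"base {base}: digits*base = {rel_all:.3f}, "
          f"digits*(base-1) = {rel_nonzero:.3f}  (relative to base 2)")
```

Under this approximation the digits × base weighting has its minimum at base 3, about 5% below base 2, which matches the figures quoted above, while the digits × (base - 1) weighting grows steadily from base 2 upward. With either weighting, moving to a higher base brings little or no reduction in the number of levels.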

It will be instructive to compare the throughput of a purely analog optical system that uses the same technology as the four architectures described in Sec. III. Figure 9 shows a space-integrating inner-product processor built with thirty-two-channel point modulator arrays and a high speed photodetector. If we assume 50-MHz analog bandwidth/channel and 9-bit accuracy for the optical system, the computation throughput will be 1.6 × 10^9 multiply adds/s at 9-bit accuracy. The output of the detector can be used in further stages of computation without additional postprocessing electronics. This throughput is highly attractive, especially when obtained with an optical processor that could be very compact and require low power.

The linear analog optical processors thus seem best suited for applications that do not demand high accuracy but put a premium on high computational throughput in a small volume with low power consumption.

The authors would like to thank R. C. Williamson of Lincoln Laboratory for pointing out the central role of the total number of levels produced by an optical system and for numerous insightful comments.

References

1. L. J. Cutrona, E. N. Leith, L. J. Porcello, and W. E. Vivian, "On the Application of Coherent Optical Processing Techniques to Synthetic Aperture Radar," Proc. IEEE 54, 1026 (1966).
2. T. M. Turpin, "Spectrum Analysis Using Optical Processing," Proc. IEEE, 79 (1981).
3. Special Issue on Acoustooptics, Proc. IEEE (Jan. 1981).
4. N. J. Berg, J. N. Lee, M. W. Casseday, and E. Katzen, in Ultrasonics Symposium Proceedings, IEEE Catalog No. 78 CH 134-1SU (1978), p. 91.
5. A. Huang, Y. Tsunoda, J. W. Goodman, and S. Ishihara, "Optical Computation Using Residue Arithmetic," Appl. Opt. 18, 149 (1979).
6. A. Tai, I. Cindrich, J. R. Fienup, and C. C. Aleksoff, "Optical Residue Arithmetic Computer with Programmable Computation Modules," Appl. Opt. 18, 2812 (1979).
7. D. Psaltis and D. Casasent, "Optical Residue Arithmetic: A Correlation Approach," Appl. Opt. 18, 163 (1979).
8. S. A. Collins, Jr., "Numerical Optical Data Processing," Proc. Soc. Photo-Opt. Instrum. Eng. 128, 313 (1977).
9. C. C. Guest and T. K. Gaylord, "Truth-Table Look-up Optical Processing Utilizing Binary and Residue Arithmetic," Appl. Opt. 19, 1201 (1980).
10. R. Arrathoon and M. N. Hassoun, "Optical Threshold Logic Elements for Digital Computation," Opt. Lett. 9, 143 (1984).
11. "Vedic Mathematics," Shankaracharya of Govardhana Pitha, Motilal Banarsidass Pub., New Delhi, ISBN: 0-89581-416-1 (1965).
12. E. E. Schwartzlander, Jr., "The Quasi-serial Multiplier," IEEE Trans. Comput. C-22, 317 (1973).
13. H. J. Whitehouse and J. Speiser, "Linear Signal Processing Architectures," in Aspects of Signal Processing with Emphasis on Underwater Acoustics, Vol. 2, G. Tacconi, Ed. (Reidel, Hingham, MA, 1977).
14. D. Psaltis et al., "Accurate Numerical Computation by Optical Convolution," Proc. Soc. Photo-Opt. Instrum. Eng. 232, 151 (1980).
15. P. S. Guilfoyle, "Systolic Acousto-optic Binary Convolver," Opt. Eng. 23, 20 (1984).
16. W. C. Collins, R. A. Athale, and P. D. Stilwell, "Improved Accuracy for Optical Iterative Processor," Proc. Soc. Photo-Opt. Instrum. Eng. 352, 59 (1983).
17. R. P. Bocker, "Optical Digital RUBIC (Rapid Unbiased Bipolar Incoherent Calculator) Cube Processor," Opt. Eng. 23, 26 (1984).
18. K. Wagner and D. Psaltis, "A Space-integrating Acousto-optic Matrix Multiplier," Opt. Commun. 52, 173 (1984).
19. A. P. Goutzoulis, "Systolic Time-Integrating Acoustooptic Binary Processor," Appl. Opt. 23, 4095 (1984).
20. S. Cartwright and S. C. Gustafson, "Convolver-based Optical Systolic Architectures," Opt. Eng. 26, 59 (1985).
21. C. M. Verber, "Integrated Optical Architectures for Matrix Multiplications," Opt. Eng. 24, 19 (1985).
22. J. Jackson and D. Casasent, "Optical Systolic Array Processor Using Residue Arithmetic," Appl. Opt. 22, 2817 (1983).
23. M. S. Mort, "Modified Quasi-Serial Multiplier," Appl. Opt. 24, 1396 (1985).
24. R. A. Athale, W. C. Collins, and P. D. Stilwell, "High Accuracy Matrix Multiplication with Outer Product Optical Processor," Appl. Opt. 22, 368 (1983).
25. S. L. Hurst, "Multiple Valued Logic-Its Status and Its Future," IEEE Trans. Comput. C-33, 1160 (1984).
