Reducing the computation time in (short bit width) 2's complement multipliers

Aim

Design of a (short bit width) two's complement multiplier with reduced computation time.

Abstract

Two's complement multipliers are important for a wide range of applications. In this paper, we present a technique to reduce by one row the maximum height of the partial product array generated by a radix-4 Modified Booth Encoded multiplier, without any increase in the delay of the partial product generation stage. The proposed method is general and can be extended to higher radix encodings, as well as to square multipliers of any size and to m × n rectangular multipliers. We evaluated the proposed approach by comparing it with some other possible solutions; the results, based on a rough theoretical analysis and on logic synthesis, showed its efficiency in terms of both area and delay.

Project Overview

1. Identify the architecture from the literature survey.
2. Model the architecture at the register transfer level (RTL).
3. Verify the functionality of the modeled architecture in ModelSim.
4. Synthesize the verified design in Xilinx ISE.
5. Generate the bitstream file for download into the Spartan-3E FPGA.
6. Program the bitstream file into the FPGA.
7. Perform post-simulation verification in ChipScope Pro.

Introduction

The MAC (multiplier-and-accumulator unit) is used for image processing and digital signal processing (DSP) in a DSP processor. The multiplier and the MAC are the essential elements of digital signal processing operations such as filtering, convolution, and inner products. On a given processor, the MAC cannot run at 100% efficiency.

Because these operations are basically accomplished by repetitive application of multiplication and addition, the speed of the multiplication and addition arithmetic determines the execution speed and performance of the entire calculation. Since the MAC limits the overall speed, a fast MAC on a given processor needs a dedicated algorithm for the multiplication instruction.
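The repeated multiply-and-add pattern mentioned above can be sketched in a few lines of Python. This is only a behavioral illustration of an inner product computed with one MAC step per pair of samples; the function name mac_inner_product is hypothetical and does not come from the slides.

```python
def mac_inner_product(a, b):
    """Inner product computed as a chain of multiply-accumulate steps."""
    acc = 0  # accumulator register, cleared before the first step
    for x, y in zip(a, b):
        acc += x * y  # one MAC step: multiply, then add into the accumulator
    return acc


if __name__ == "__main__":
    print(mac_inner_product([1, 2, 3], [4, 5, 6]))  # 1*4 + 2*5 + 3*6 = 32
```

Every loop iteration waits on one multiplication followed by one addition, which is why the multiplier delay dominates the throughput of filtering and convolution kernels built this way.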

Literature survey

In the previously proposed architecture, the critical path was reduced by eliminating the adder for accumulation and decreasing the number of input bits in the final adder. While it performs better than earlier MAC architectures because of the reduced critical path, its output rate still needs to be improved, because the final adder results are used for accumulation.

An architecture that merges the adder block into the accumulator register of the MAC operator was proposed in [18]. It makes it possible to use two separate N/2-bit adders instead of one N-bit adder to accumulate the N-bit MAC results, at the cost of additional hardware.

Modified Booth Encoding

A possible implementation of the radix-4 MBE and of the corresponding partial product generation is shown in Fig. 1, which comes from a small adaptation. For each partial product row, Fig. 1a produces the one, two, and neg signals. These signals are then exploited by the logic in Fig. 1b, along with the appropriate bits of the multiplicand, in order to generate the whole partial product array. Other alternatives for the implementation of the recoding and partial product generation can be found in the literature.
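As a rough illustration of the recoding described above, the following Python sketch models the one, two, and neg signals behaviorally and uses them to build the partial products. It is not the gate-level circuit of the figure: the function names, the Boolean expressions for one and two, and the choice neg = y[2i+1] are assumptions taken from a common textbook formulation of radix-4 MBE.

```python
def booth_radix4_signals(y, n):
    """Return one (one, two, neg) triple per radix-4 digit of the n-bit multiplier y.

    y is a signed Python int in the n-bit two's complement range; Python's
    arithmetic right shift supplies the sign extension needed when n is odd.
    """
    signals = []
    prev = 0  # y[-1] is defined as 0
    for i in range(0, n, 2):
        y0 = (y >> i) & 1
        y1 = (y >> (i + 1)) & 1
        one = y0 ^ prev                # digit magnitude is 1
        two = (y1 ^ y0) & (one ^ 1)    # digit magnitude is 2
        neg = y1                       # digit is negative (or a "negative zero")
        signals.append((one, two, neg))
        prev = y1
    return signals


def booth_multiply(x, y, n):
    """Multiply x by the n-bit multiplier y by summing the recoded partial products."""
    product = 0
    for row, (one, two, neg) in enumerate(booth_radix4_signals(y, n)):
        magnitude = one * x + two * (x << 1)   # select 0, x, or 2x
        # Negation is "invert, then add neg"; the added neg bit is the extra
        # bit that hardware places below the row's least significant bit.
        partial = (~magnitude + 1) if neg else magnitude
        product += partial << (2 * row)        # each row is weighted by 4**row
    return product


if __name__ == "__main__":
    print(booth_radix4_signals(6, 4))  # [(0, 1, 1), (0, 1, 0)], i.e. digits -2 and +2
    print(booth_multiply(-7, 6, 4))    # -42
```

The sketch works on unbounded Python integers, so it shows the arithmetic of the recoding only; sign-extension handling and bit widths of a real partial product array are not modeled.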

Two's complement computation (n = 8)

Application

Multimedia and communication systems; real-time signal processing such as audio signal processing, video/image processing, or large-capacity data processing.

Tools used:

Simulation: ModelSim
Synthesis: Xilinx ISE

Simulation Results

Partial product generation

Top Module

Device utilization summary:

RTL Schematic

Technology Schematic

Conclusion:

Two's complement n × n multipliers using radix-4 Modified Booth Encoding produce ⌈n/2⌉ partial products but, due to the sign handling, the partial product array has a maximum height of ⌈n/2⌉ + 1. We presented a scheme that produces a partial product array with a maximum height of ⌈n/2⌉.
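The row counts stated above can be checked with a small counting model. The sketch below is a simplification under stated assumptions: each row is taken to hold n + 1 bits starting at column 2i, sign-extension bits are ignored, and each neg bit sits in the column of its row's least significant bit. The function name max_column_height and its flag are hypothetical. Setting the flag to False simply drops the last neg bit from the count; how the presented scheme actually absorbs that bit in hardware is not modeled here.

```python
from math import ceil


def max_column_height(n, last_neg_in_extra_row=True):
    """Maximum number of bits stacked in any column of a simplified MBE array."""
    rows = ceil(n / 2)
    heights = {}
    for i in range(rows):
        # Partial product bits of row i (n + 1 bits, sign extension ignored).
        for col in range(2 * i, 2 * i + n + 1):
            heights[col] = heights.get(col, 0) + 1
        # The neg bit of row i lands in column 2*i; the last one needs a row
        # of its own unless it is eliminated.
        is_last = i == rows - 1
        if not is_last or last_neg_in_extra_row:
            heights[2 * i] = heights.get(2 * i, 0) + 1
    return max(heights.values())


if __name__ == "__main__":
    print(max_column_height(8))                               # 5 = ceil(8/2) + 1
    print(max_column_height(8, last_neg_in_extra_row=False))  # 4 = ceil(8/2)
```

For n = 8 the model gives five rows with the extra neg bit and four without it, which is the one-row reduction the conclusion refers to.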