Image Compression System
Megan Fuller and Ezzeldin Hamed
Slide 2
Transforms of Images
[Figure panels: Original Image; Image Reconstructed from 25% of DFT coefficients; Magnitude of DFT of (Image - 128) (otherwise DC component = ~8e6)]
Slide 3
The 2D Discrete Fourier Transform
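The slide's equation did not survive extraction. For reference, the textbook form of the 2D DFT is sketched below, assuming a square N x N image x(n_1, n_2); the symbol names are assumptions, not taken from the slide. The second line shows why the transform is separable: the inner sum is a 1D DFT along each row, and the outer sum is a 1D DFT along each column of the result.

```latex
X(k_1, k_2) = \sum_{n_1=0}^{N-1} \sum_{n_2=0}^{N-1}
              x(n_1, n_2)\, e^{-j 2\pi (k_1 n_1 + k_2 n_2)/N}
            = \sum_{n_1=0}^{N-1} e^{-j 2\pi k_1 n_1 / N}
              \left[ \sum_{n_2=0}^{N-1} x(n_1, n_2)\, e^{-j 2\pi k_2 n_2 / N} \right],
\qquad 0 \le k_1, k_2 \le N-1 .
```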
Slide 4
The 2D Discrete Cosine Transform
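The slide's equation did not survive extraction. One common form of the 2D DCT is sketched below, assuming a square N x N image and the unnormalized DCT-II convention, which matches the y(n) = x(n) + x(2N-1-n) construction used later in the deck. The scale factor is an assumption; other texts normalize differently.

```latex
C(k_1, k_2) = \sum_{n_1=0}^{N-1} \sum_{n_2=0}^{N-1}
              4\, x(n_1, n_2)\,
              \cos\!\left(\frac{\pi (2 n_1 + 1) k_1}{2N}\right)
              \cos\!\left(\frac{\pi (2 n_2 + 1) k_2}{2N}\right),
\qquad 0 \le k_1, k_2 \le N-1 .
```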
Slide 5
High Level Architecture
[Block diagram components: Input; Memory; Separable, in-place 2D DFT/DCT; Coefficient > Threshold?; Output Module (sending data to PC)]
The choice between DFT and DCT is made at compile time
The threshold is provided by the user at run time
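The "Coefficient > Threshold?" stage can be sketched in software as follows. This is a minimal illustration, not the hardware design: the function name, the comparison on magnitude, and the (row, column, value) output format are all assumptions made here.

```python
def threshold_coefficients(coeffs, threshold):
    """Keep only coefficients whose magnitude exceeds the threshold.

    coeffs is a 2D list of real or complex transform coefficients.
    Returns (row, column, value) tuples for the kept coefficients
    (hypothetical output format, chosen for illustration).
    """
    kept = []
    for r, row in enumerate(coeffs):
        for c, value in enumerate(row):
            if abs(value) > threshold:  # magnitude test, an assumption
                kept.append((r, c, value))
    return kept


# Usage: a larger threshold keeps fewer coefficients.
example = [[10.0, 1.0], [-3.0, 0.5]]
kept = threshold_coefficients(example, 2.0)
```

Because the threshold is a run-time input, the same compiled design can trade image quality against compression ratio without resynthesis.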
Slide 6
What's Interesting?
Reducing the computation required
Sharing resources in the DCT case
Some memory organization tricks
Reducing bit width
Slide 7
Number of FFTs
Slide 8
Reduction for the DFT case
[Figure: 4x4 sample array S00 ... S33; labels S31, S11, Real, Imag]
N/2 FFTs of the rows, followed by even/odd decomposition
The output is symmetric (discard half the columns)
N/2 FFTs of the columns
Total of N FFT computations
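The row step above relies on a standard identity: two real sequences can be transformed with one complex FFT by packing one into the real part and the other into the imaginary part, then separating the results with an even/odd (conjugate-symmetric) decomposition. The sketch below illustrates that identity with a plain O(N^2) DFT standing in for the hardware FFT; the function names are hypothetical.

```python
import cmath


def dft(x):
    """Direct O(N^2) DFT, used here only as a stand-in for an FFT."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def two_real_dfts(a, b):
    """DFTs of two real sequences a and b using a single complex DFT."""
    n = len(a)
    # Pack row a into the real part and row b into the imaginary part.
    z = dft([complex(ar, br) for ar, br in zip(a, b)])
    spec_a, spec_b = [], []
    for k in range(n):
        zk = z[k]
        zc = z[(-k) % n].conjugate()
        spec_a.append((zk + zc) / 2)    # conjugate-symmetric part -> DFT of a
        spec_b.append((zk - zc) / 2j)   # conjugate-antisymmetric part -> DFT of b
    return spec_a, spec_b


row0 = [1.0, 2.0, 3.0, 4.0]
row1 = [0.5, -1.0, 2.5, 0.0]
spec0, spec1 = two_real_dfts(row0, row1)
```

With N real rows this needs N/2 complex transforms instead of N, which is where the row count on the slide comes from.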
Slide 9
Reduction in the DCT case
[Figure: 4x4 sample array with rows in the order S1x, S0x, S3x, S2x; labels Real, Imag]
Rows are again combined in the same way as in the DFT case (N/2 FFTs)
Even/odd decomposition, then an extra multiplication to calculate the DCT
The results are not symmetric, but the DCT is real
We can therefore combine the columns the same way we combined the rows (N/2 FFTs)
The same multiplier inside the FFT is used
Another even/odd decomposition is required here, with an extra complex multiplier
Total of N FFT computations plus a few extra multiplications
Slide 10
In-Place Radix-4 FFT
Slide 11
Static Scaling Vs. Dynamic Scaling

Static scaling:
Shift when you expect an overflow
Shift after each addition
The location of the fraction point is fixed at each computation step
Almost no overhead compared to fixed point
Higher effective bit width only in the first computation steps
No effect on the critical path

Dynamic scaling:
Shift only when overflow occurs
Track overflows and account for them
The location of the fraction point is the same for each 1D-FFT frame
Needs simple circuitry to track the overflow and shift when required
Effective bit width depends on the data
No effect on the critical path
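The difference between the two schemes can be sketched in software as follows. This is a simplified model under stated assumptions: values are signed integers, a whole frame shares one shift count, and the scaling is applied once per frame rather than after each FFT stage as real hardware would. All function names are hypothetical.

```python
def fits(value, bits):
    """True if value is representable as a signed two's-complement word."""
    return -(1 << (bits - 1)) <= value <= (1 << (bits - 1)) - 1


def static_scale(frame):
    """Static scaling: always shift by one bit, whether or not it is needed."""
    return [v >> 1 for v in frame]


def dynamic_scale(frame, bits):
    """Dynamic scaling: shift the whole frame only as far as overflow requires.

    Returns the scaled frame and the shift count, which must be stored
    alongside the data so the decoder can undo the scaling.
    """
    shift = 0
    while not all(fits(v >> shift, bits) for v in frame):
        shift += 1
    return [v >> shift for v in frame], shift


# A frame that overflows 8 bits needs two shifts.
scaled, shift = dynamic_scale([100, -300, 50], 8)   # -> ([25, -75, 12], 2)
# A frame that already fits keeps all of its bits.
unscaled, no_shift = dynamic_scale([10, 20], 8)     # -> ([10, 20], 0)
```

The stored shift count is the "scaling bits" overhead: it costs a little compression but preserves low-order bits whenever the data is small.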
Slide 12
Design Space Explored
[Diagram: design space by dynamic scaling (Yes/No), transform (DFT/DCT), and bit width (8, 12, 16)]
8 bits with dynamic scaling is considered later
8 bits without dynamic scaling (and 12 bits for the DCT) perform too poorly to be considered
With dynamic scaling, 12 bits does as well as 16 bits in the DFT
Slide 13
Dynamic Scaling of DFT
50% of the coefficients are sufficient for perfect reconstruction because of the symmetry of the DFT
16 bits without dynamic scaling does as well as floating point
12 bits with dynamic scaling also does nearly as well as floating point
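The symmetry referred to above is the conjugate symmetry of the DFT of a real-valued image. A sketch of the identity, assuming a square N x N real image x(n_1, n_2) with DFT X(k_1, k_2) (symbol names are assumptions):

```latex
X(k_1, k_2) = X^{*}\big( (N - k_1) \bmod N,\; (N - k_2) \bmod N \big)
```

Each coefficient therefore determines its mirror-image partner, so roughly half of the coefficients carry all of the information.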
Slide 14
Dynamic Scaling of DFT (continued)
The improvement in performance from dynamic scaling more than makes up for the reduced compression caused by having to save the scaling bits
12 bits with dynamic scaling does nearly as well as 16 bits
Slide 15
DCT Vs. DFT
All cases use dynamic scaling
The DCT provides better energy compaction
For the DCT, 12 bits gives a lower MSE for a given compression ratio (this was not the case for the DFT)
Slide 16
8 Bits
[Figure] Image reconstructed from 50% of the DFT coefficients, computed with 8 bits, using dynamic scaling. MSE = 452.
[Figure] Image reconstructed from 6% of the DFT coefficients, computed with 16 bits. MSE = 129.
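The MSE figures quoted on these slides are mean squared errors between the original and reconstructed images. The slides do not give the exact formula used, so the sketch below assumes the usual definition: the average of the squared per-pixel differences over the whole image. The function name is hypothetical.

```python
def mse(original, reconstructed):
    """Mean squared error between two equally sized 2D pixel arrays."""
    total = 0.0
    count = 0
    for row_o, row_r in zip(original, reconstructed):
        for p_o, p_r in zip(row_o, row_r):
            total += (p_o - p_r) ** 2
            count += 1
    return total / count


# Usage: a single pixel off by 4 in a 1x2 image gives MSE = 16 / 2 = 8.
error = mse([[10, 20]], [[10, 24]])
```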
Slide 17
Physical Considerations

Transform  # of Bits  Dynamic Scaling?  Critical Path  Slice Registers  Slice LUTs  BRAM  DSP48Es
DFT        16         No                11.458 ns      16%              23%         29%   7
DFT        16         Yes               11.763 ns      17%              24%         29%   7
DFT        12         No                11.273 ns      15%              22%         24%   7
DFT        12         Yes               11.464 ns      16%              23%         24%   7
DFT        8          Yes               11.287 ns      15%              22%         18%   6
DCT        16         Yes               11.458 ns      19%              26%         29%   10
DCT        12         Yes               11.273 ns      18%              25%         24%   10
DCT        8          Yes               11.066 ns      17%              23%         18%   8

The critical path is about the same for all designs and could probably be improved with tighter synthesis constraints
Resource usage increases with bit width, the addition of dynamic scaling, and the DCT, but overall doesn't change much
The DCT uses extra DSP blocks because of the extra multiplication
Slide 19
Future Work
Use of DRAM to allow compression of larger images
Support for color images
Support for rectangular images of arbitrary edge length
Combining the DCT and DFT into a single core that could compute either transform, as selected by the user at runtime
Slide 20
Relationship Between the DFT and the DCT
The N-point DFT of a sequence gives the Fourier series coefficients of that sequence made periodic with period N.
Slide 21
Relationship Between the DFT and the DCT (continued)
The N-point DCT of a sequence is a twiddle factor multiplied by the first N Fourier series coefficients of the 2N-point sequence y(n) made periodic with period 2N.
y(n) = x(n) + x(2N-1-n), where x(n) is taken to be zero outside 0 <= n <= N-1
[Figure: x(n)]
Slide 22
Relationship Between the DFT and the DCT (continued)
Slide 23
Rounding

Design                              MSE Decrease with Rounding
12 bits, no dynamic scaling, DFT    20
16 bits, no dynamic scaling, DFT    0
12 bits, dynamic scaling, DFT       2
16 bits, no dynamic scaling, DCT    0
12 bits, dynamic scaling, DCT       2
16 bits, dynamic scaling, DCT       0

Conclusion: Rounding never hurt and often helped. It is free in hardware (just a register initialization), so always use it.
All subsequent results use rounding.
Slide 24
Dynamic Scaling of DCT
Slide 25
Dynamic Scaling of DCT (continued)
Slide 26
Limitations of MSE
[Figure] Image reconstructed from 5.7% of the DCT coefficients, computed with dynamic scaling. MSE = 193.
[Figure] Image reconstructed from 6.1% of the DCT coefficients, computed without dynamic scaling. MSE = 338.
Slide 27
Performance of 8 Bit Systems
Slide 28
More Limitations of MSE
(Left) 8 bit DFT coefficients, computed with rounding. Compression ratio = 2.3, MSE = 869.
(Right) 8 bit DFT coefficients, computed without rounding. Compression ratio = 2.1, MSE = 664.
(Left) 8 bit DCT coefficients, computed with rounding. Compression ratio = 2.2, MSE = 517.
(Right) 8 bit DCT coefficients, computed without rounding. Compression ratio = 2.4, MSE = 563.