Hongtao Du. AICIP Research, ECE Department, University of Tennessee. Feb 23, 2005. Background: Blind Source Separation (BSS). Motivation: “cocktail party problem”. BSS model: mixing and unmixing. BSS algorithms: ICA, LCNN. Pixel-level processing.
1
Synthesis Structures for Blind Source Separation
Hongtao Du
AICIP Research
ECE Department, University of Tennessee
Feb 23, 2005
2
Background
• Blind Source Separation (BSS) Motivation: “cocktail party problem”
• BSS Model: (Mixing) (Unmixing)
• BSS Algorithms
  – ICA
  – LCNN
• Pixel level processing
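Mixing: $X = AS$
Unmixing: $S = WX$, that is,
$$
\begin{bmatrix} s_1 \\ \vdots \\ s_m \end{bmatrix}
=
\begin{bmatrix} w_{11} & \cdots & w_{1n} \\ \vdots & \ddots & \vdots \\ w_{m1} & \cdots & w_{mn} \end{bmatrix}
\begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}
$$
$X$: the observed signal (pixel)
$S$: the source signal (pure pixel or noise)
$W = A^{-1}$: weight matrix or unmixing matrix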
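As a rough illustration of the mixing and unmixing model, the sketch below mixes two sources with a known 2x2 matrix and recovers them with W = A^{-1}. The matrix values, the sample data, and the helper names `matmul` and `inverse_2x2` are assumptions made for this example. In a real BSS problem A is unknown, and W has to be estimated by an algorithm such as ICA.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def inverse_2x2(a):
    """Invert a 2x2 matrix; raises if the matrix is singular."""
    (p, q), (r, s) = a
    det = p * s - q * r
    if det == 0:
        raise ValueError("mixing matrix is singular")
    return [[s / det, -q / det], [-r / det, p / det]]


# Assumed mixing matrix A (unknown in a real BSS problem).
A = [[1.0, 0.5],
     [0.25, 1.0]]

# Assumed sources S: one row per source, one column per sample (pixel).
S = [[1.0, 2.0, 3.0],
     [4.0, 0.0, -2.0]]

X = matmul(A, S)          # mixing:   X = A S
W = inverse_2x2(A)        # unmixing matrix W = A^{-1}
S_rec = matmul(W, X)      # unmixing: S = W X

print(S_rec)
```

Because A is known here, the recovery is exact up to floating-point error. A blind algorithm generally recovers the sources only up to scaling and ordering.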
3
Synthesis Structures
• Serial Processing
  – Processing pixel-by-pixel in a serial sequence
• Parallel Processing
  – Using SIMD structure
  – Multiple pixels in, multiple pixels out
  – Depending on hardware constraints
• Segment Processing
  – Pipeline structure
  – Parallel processing
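The three data flows can be modelled in software as follows. This is only a sketch of the order in which pixels are handled; it says nothing about hardware timing. The function names, the lane count of 4, and the two-stage split into `scale` and `offset` are all assumptions for illustration.

```python
def serial(pixels, op):
    """Serial processing: one pixel in, one pixel out per step."""
    out = []
    for p in pixels:
        out.append(op(p))
    return out


def parallel(pixels, op, lanes=4):
    """SIMD-style processing: up to `lanes` pixels in and out per step."""
    out = []
    for start in range(0, len(pixels), lanes):
        group = pixels[start:start + lanes]
        # In hardware, every pixel of the group would be handled in the same step.
        out.extend([op(p) for p in group])
    return out


def pipeline(pixels, stages):
    """Pipeline processing: each pixel passes through the stages in order."""
    stream = iter(pixels)
    for stage in stages:
        stream = map(stage, stream)  # chain the stages lazily
    return list(stream)


# Hypothetical per-pixel operation, split into two pipeline stages.
def scale(r):
    return 2 * r


def offset(v):
    return v - 50


def stretch(r):
    return offset(scale(r))


pixels = [30, 60, 90, 120, 150]
print(serial(pixels, stretch))
print(parallel(pixels, stretch, lanes=4))
print(pipeline(pixels, [scale, offset]))
```

All three functions return the same pixel values; the structures differ only in how much work is done per step.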
4
Contrast Stretching
r, s : grey levels of the input pixel and the output pixel
$s = T(r) = m r + b$
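A minimal sketch of this linear transform for a single pixel is given below. Clamping the result to the valid range of the pixel width is an assumption added here, as are the function name `contrast_stretch` and the example values of m and b.

```python
def contrast_stretch(r, m, b, bits=8):
    """Apply s = m*r + b and clamp s to the range [0, 2**bits - 1].

    The clamping step is an assumption of this sketch; the slide only
    gives the linear transform itself.
    """
    max_level = (1 << bits) - 1
    s = m * r + b
    return max(0, min(max_level, s))


# Hypothetical parameters: slope 2, intercept -100, 8-bit pixels.
for r in (20, 100, 200):
    print(r, contrast_stretch(r, m=2, b=-100))
```

With these parameters, a mid-range input of 100 maps to itself, while inputs far from mid-range saturate at 0 or 255.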
5
Component Contrast
6
Component Contrast - RTL
7
Component Contrast - Schematic
8
Top-level - Schematic
9
Pre-layout Simulation
10
Pre-layout Simulation – Small Signal
11
Pre-layout Simulation – Reset
12
Pre-layout Simulation – Write Enable
13
Contrast Stretching (32-bit) – FPGA layout
14
Contrast Stretching (8-bit) – FPGA layout
15
Comparison: 32-bit vs. 8-bit
• Device utilization summary:
  – 32-bit
    – Number of External IOBs       132 out of 158     83%
    – Number of Occupied SLICEs     605 out of 12288    4%
  – 8-bit
    – Number of External IOBs        36 out of 158     22%
    – Number of Occupied SLICEs      53 out of 12288    1%
• Clock Report
  Constraint (both designs): TS_Clk = PERIOD TIMEGRP "Clk" 100 nS HIGH 50 nS

  Design   Requested    Actual      Frequency    Estimated Delay
  32-bit   100.000 ns   14.464 ns    69.14 MHz   16.63 ns
  8-bit    100.000 ns    7.450 ns   134.23 MHz   11.33 ns
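The Frequency column of the clock report is the reciprocal of the Actual period. The short check below reproduces the reported values and also computes the margin against the requested 100 ns period. The function names are assumptions for this sketch.

```python
def frequency_mhz(period_ns):
    """Maximum clock frequency in MHz for a minimum period in nanoseconds."""
    if period_ns <= 0:
        raise ValueError("period must be positive")
    return 1000.0 / period_ns


def slack_ns(requested_ns, actual_ns):
    """Margin between the requested period and the achieved period."""
    return requested_ns - actual_ns


# Actual periods taken from the clock report (serial design).
for label, actual in (("32-bit", 14.464), ("8-bit", 7.450)):
    print(label,
          round(frequency_mhz(actual), 2), "MHz,",
          round(slack_ns(100.0, actual), 3), "ns slack")
```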
16
Parallel Contrast - Schematic
17
Pre-layout Simulation
18
Parallel Contrast Stretching – FPGA layout
19
Constraint
• Device utilization summary:
  – 32-bit
    – Number of External IOBs       580 out of 158    367%
    – Number of Occupied SLICEs    4838 out of 12288   39%
    – Too many required IOBs, exceeding the target FPGA capacity
  – 8-bit
    – Number of External IOBs       148 out of 158     93%
    – Number of Occupied SLICEs     422 out of 12288    3%
• Clock Report
  Constraint (both designs): TS_Clk = PERIOD TIMEGRP "Clk" 100 nS HIGH 50 nS

  Design   Requested    Actual      Frequency    Estimated Delay
  32-bit   100.000 ns   /           /            16.63 ns
  8-bit    100.000 ns    6.748 ns   148.19 MHz   11.63 ns
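The utilization figures above can be checked with a few lines of arithmetic. The sketch assumes the report truncates percentages to whole numbers, which matches the four values on this slide. The function name `utilization` and the tuple layout are hypothetical.

```python
def utilization(used, available):
    """Return (percent, fits): truncated percentage and whether the design fits."""
    percent = 100 * used // available   # integer division truncates
    return percent, used <= available


# (design, resource, used, available) taken from the utilization summary.
report = [
    ("32-bit", "IOBs", 580, 158),
    ("32-bit", "SLICEs", 4838, 12288),
    ("8-bit", "IOBs", 148, 158),
    ("8-bit", "SLICEs", 422, 12288),
]

for design, resource, used, available in report:
    percent, fits = utilization(used, available)
    status = "ok" if fits else "exceeds device capacity"
    print(design, resource, str(percent) + "%", status)
```

Only the 32-bit IOB count fails the check, which is the constraint named in the slide title.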
20
Pipeline Contrast - Schematic
21
Top-level - Schematic
22
Pre-layout Simulation
23
Pre-layout Simulation - threshold
24
Pipeline Contrast Stretching – FPGA layout
25
Synthesis Performance (8-bit)
Device: Xilinx V1000EHQ-6
• Device utilization summary:
  – Number of External IOBs                156 out of 158     98%
  – Number of Occupied SLICEs              586 out of 12288    4%
  – Total equivalent gate count for design 13,474
• Clock Report
  Constraint: TS_Clk = PERIOD TIMEGRP "Clk" 100 nS HIGH 50 nS

  Requested    Actual      Frequency   Estimated Delay
  100.000 ns   20.944 ns   47.75 MHz   11.63 ns
26
Serial vs. Parallel

  Structure   Requested    Actual      Frequency    Estimated Delay
  Serial      100.000 ns    7.450 ns   134.23 MHz   11.33 ns
  Parallel    100.000 ns    6.748 ns   148.19 MHz   11.63 ns
  Pipeline    100.000 ns   20.944 ns    47.75 MHz   11.63 ns

• Serial processing would be expected to have the minimum delay, but in practice it does not.
• Parallel processing is the fastest structure.
• Pipeline is the most efficient structure, but it is very slow.
27
[Figure panels: Serial, Parallel, Pipeline]