41
Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive Cloud/Edge Chris Gwo Giun Lee, PhD Director, Bioinfotronics Research Center, Professor, Department of Electrical Engineering, National Cheng Kung University Tainan, Taiwan

Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive Cloud/Edge

Chris Gwo Giun Lee, PhD

Director, Bioinfotronics Research Center,

Professor, Department of Electrical Engineering,

National Cheng Kung University

Tainan, Taiwan

Page 2: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

Outline

Introduction

Analytics Architecture: Abstraction at the System Level

Algorithm Architecture Co-Design Space Exploration

via Machine Learning

Algorithmic Intrinsic Complexity Metrics and Assessment

Intelligent Parallel/Reconfigurable Computing

Case studies

Multimedia: MPEG

Mobile Health: Reconfigurable CNN

2019/9/25 2Bioinfotronics Research Center Prof. Gwo Giun Lee/NCKUEE Media SoC Lab.

Page 3: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

Introduction

2019/9/25 3Bioinfotronics Research Center Prof. Gwo Giun Lee/NCKUEE Media SoC Lab.

Page 4: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

Vibrant & Fast Changing World

Before Industrial Revolution:

▪ Innovations in ENERGY and ELECTRICITY brought forth automation

▪ Revolutionary changes to traditional Artisan craftsmanship from the social, political, and economical perspectives.

Today:

▪ INFORMATION explosion resulting in BIG DAT

▪ McKinsey forecasted on changes by AI to be 10 times faster and 300 times larger in scale as compared to former Industrial Revolution!

Industry 1.0: Energy Industry 2.0: Electricity

Industry 3.0: Information Industry 4.0: AI

Page 5: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

Apollo Navigation Computer Half a Century Ago

2019/9/25 5Bioinfotronics Research Center Prof. Gwo Giun Lee/NCKUEE Media SoC Lab.

“That's one small step for an engineer; one giant

leap for engineering.”

Page 6: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

Reaching Out Even Further via IoT & Going in Ever Deeper.. Ubiquitous Computers

2019/9/25 Bioinfotronics Research Center Prof. Gwo Giun Lee/NCKUEE Media SoC Lab. 6

Page 7: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

Ever More Complex Analytics Algorithms Should Run on Analytics Architecture

Bioinfotronics Research Center Prof. Gwo Giun Lee/NCKUEE Media SoC Lab. 7

Speech recognition with feelings

Analytics Algorithm:

Analyzes speech & images

Facial emotion detection

Analytics Architecture:

Analyzes dataflow

2019/9/25

Page 8: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

Algorithm/Architecture Co-Design: Analytics Architecture for SMART SoC

2019/9/25 8Bioinfotronics Research Center Prof. Gwo Giun Lee/NCKUEE Media SoC Lab.

Page 9: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

New Design Paradigm: Moving from programming to design and beyond…

Lee from NCKU (2007):

Design = Algorithm + Architecture

Wirth from ETHZ (1975):

Programming = Algorithm + Data Structure

2019/9/25 9Bioinfotronics Research Center Prof. Gwo Giun Lee/NCKUEE Media SoC Lab.

Page 10: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

Architectural Platforms Beyond Cloud…Post Moore’s law

2019/9/25 Bioinfotronics Research Center Prof. Gwo Giun Lee/NCKUEE Media SoC Lab. 10

Performance/Area

Flexibility

PowerASIC

Embedded Reconfigurable Logic/ FPGA

Application Specific Instruction Set Processor (ASIP)

Instruction Set DSP

Reconfigurable Processor/FPGA

Embedded general purpose instruction set processor

Embedded multicore processor/GPU

Towards Cloud &

Heterogeneous Systems

Neuromorphic

Device

Page 11: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

Human Brain: THE Most Power Efficient Intelligence

2019/9/25 Bioinfotronics Research Center Prof. Gwo Giun Lee/NCKUEE Media SoC Lab. 11

AlphaGo 13 Layer CNN Human is 300x more power

efficient

https://www.slideshare.net/ShaneSeungwhanMoon/h ow-alphago-works

Page 12: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

Design Space w/Different Levels of Abstraction

Application/Specification

Algorithm

Architecture

Cycle-accurate

model

Synthesized

netlist

Physical

design

Device

level

Different instances or realizations

Application

System

Architecture

Microarchitecture

Circuit

Device

Explore

Algorithm/Architecture Co-exploration

Software/Hardware Co-exploration

Circuit/Device Co-exploration

Neuromorphic

Levels Symbols Features Time units Modeling Tool

AlgorithmSystem functionality Seconds (sec) R, Matlab, Python,

C/C++

Architecture

System architecture:

No. of operations, data

transfer rate, data storage,

degree of parallelism

No. of cycles SystemC, CAL,

DIF, LIDE

IP(Macro)IP functionality No. of cycles Verilog, VHDL

ModuleArithmetic/Logic

operation

Cycle Verilog, VHDL

GateLogic operation, Timing

Delay

Nanosecond

(ns)

Verilog, VHDL

Circuit

Voltage, current,

resistance, capacitance,

inductance

Picosecond

(ps)

SPICE

DeviceElectron Picosecond

(ps)

PyNN, PyNCS,

Corelet

Motion

estimator

ALU

2019/9/25 Bioinfotronics Research Center Prof. Gwo Giun Lee/NCKUEE Media SoC Lab. 12

Traditionally Software/Hardware Co-design

Current Algorithm/Architecture Co-exploration for yet larger systems but

how?

Page 13: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

Algorithm/Architecture Co-Design:Cross Level Abstraction at the System Level

Algorithm Design

Architecture Design

Traditional design flowAlgorithm/Architecture Co-Exploration

Algorithm Design

Architecture Design

Complexity MeasurementNo. of operationsDegree of parallelismData transfer rateData storage requirements

Back AnnotationData Flow

G. G. (Chris) Lee, Y.-K. Chen, M. Mattavelli, and E. S. Jang, “Algorithm/Architecture Co-Exploration of Visual Computing: Overview and Future

Perspectives,” IEEE Trans. on Circuits and Systems for Video Technology. Vol. 19, Iss. 11, pp. 1576-1587, Nov. 2009.

2019/9/25 13Bioinfotronics Research Center Prof. Gwo Giun Lee/NCKUEE Media SoC Lab.

• Know software/hardware ingredients in early design phase hence from top level

• Extract complexity features from dataflow graph models

Page 14: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

Exploring Algorithm/Architecture Co-Design Space Via Machine Learning/Spectral Graph Theory

Dataflow modeling

at different abstraction levels

Characterization of

algorithmic complexity

Architectural

information

Algorithmic

complexity

Mapping

Computing

algorithms

Software/Hardware/Neuromorphic

Partition

Algorithms

Complexity

Architectural information:

Parallel/Reconfigurable

Dataflow model

Algorithmic

Space

Algorithm/Architecture

Co-exploration Space

(Spectral Graph Theory)

Complexity

Back annotation

Delay model, energy

consumption, circuit

reliability

Applications IoT/Mobile Comunication/5G Multimedia Biomedical

Software/Hardware

Co-exploration

Circuit/Device

Co-exploration

Programmable

Parallel CPU/ GPU

Design

Reconfigurable Logic

design

Analog-NVM

Neuromorphic

Accelerator

MOS Transistor

Design using

TCAD/Quantum

Simulator

Memory Device

Design using

TCAD/Quantum

Simulator

Analytic algorithm with Deep

Learning for different applications

Device capacitances,

Memory retention,

endurance, read/write

latency & current

Customized

LogicBack

annotation

2019/9/25 Bioinfotronics Research Center Prof. Gwo Giun Lee/NCKUEE Media SoC Lab. 14

Page 15: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

MODELING the COMPUTATIONAL PLATFORM

via

Dataflow Graphs (DFG)

2019/9/25 15Bioinfotronics Research Center Prof. Gwo Giun Lee/NCKUEE Media SoC Lab.

Page 16: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

Dataflow…

2019/9/25 16Bioinfotronics Research Center Prof. Gwo Giun Lee/NCKUEE Media SoC Lab.

Page 17: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

Dataflow Graph Modelling Computational Platform @ Various Data Granularity

They should contain:

Algorithmic information or behavior

Architectural information (Software/Hardware) for implementation

Some important dataflow models are:

Directed acyclic graph (DAG)

Synchronous dataflow (SDF) graph

Control data flow graph (CDFG)

Kahn process networks (KPN)

Y-chart application programming interface (YAPI)

2019/9/25 17Bioinfotronics Research Center Prof. Gwo Giun Lee/NCKUEE Media SoC Lab.

Page 18: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

Algorithm/Architecture Co-DesignSpace Exploration

Via

Machine Learning

2019/9/25 18Bioinfotronics Research Center Prof. Gwo Giun Lee/NCKUEE Media SoC Lab.

Page 19: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

How Big is Big?

Algorithmic Intrinsic Complexity Metrices/Features

PLATFORM INDEPENDENCE

2019/9/25 19Bioinfotronics Research Center Prof. Gwo Giun Lee/NCKUEE Media SoC Lab.

Page 20: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

Number of Operations

Estimates the number of each type of operations

Addition/Subtraction

Multiplication

Division

Shift

Logic operations

Operations with constant input and variable input should be

differentiated to provide high accuracy

X + Y vs. X + 5 (X and Y are variables)

X×Y vs. X×5 (X and Y are variables )

In addition, the precision of each operand should be taken into

account, since it can significantly influence complexity

2019/9/25 20Bioinfotronics Research Center Prof. Gwo Giun Lee/NCKUEE Media SoC Lab.

Page 21: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

SMART TRANSFORM PAIRvia

Spectral Graph Theory (SGT)for

Intelligent Parallel/Reconfigurable Computing

2019/9/25 21Bioinfotronics Research Center Prof. Gwo Giun Lee/NCKUEE Media SoC Lab.

Page 22: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

Parallel Computing (Forward Transform):Efficient & Flexible Cognitive Cloud

2019/9/25 Bioinfotronics Research Center Prof. Gwo Giun Lee/NCKUEE Media SoC Lab. 22

• Using SGT as machine learning in exploring the AAC space:

▪ Connected component are eigen-decomposed where spectrum of

unconnected graph components serves as information or features extracted

▪ decision making performed via the bi-partite or k-partitioning based on

principle axis theorem optimized for data independency

(d) Stream Processor within Nvidia GPU(b)

(a) (c) Retargetable Compiler

Classical

Compiler

Source

Code

ASM code

Retargetable

Compiler

Source

Code

ASM code

Processor

Model

Page 23: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

Spectral Graph Theory

i j1 if vertex and vertix are adjacent to each other(i,j)

0 otherwise

=

A

0 1 0

1 0 1

0 1 0

=

A

Adjacency matrix A

1 2 3Graph

i j

(i,j) if i j

(i,j) -1 if i j and vertex is adjacent to vertix

0 otherwise

=

=

D

L

Laplacian matrix L=D-A, where D is a diagonal matrix where the diagonal

elements represents the number of edges connected to that node.

1 1 0

1 2 1

0 1 1

− = − − −

L

2019/9/25 23Bioinfotronics Research Center Prof. Gwo Giun Lee/NCKUEE Media SoC Lab.

• Gwo Giun Lee, He-Yuan Lin, Chun-Fu Chen, Tsung-Yuan Huang, "Quantifying Intrinsic Parallelism Using Linear Algebra for

Algorithm/Architecture Co-Exploration," IEEE Transactions on Parallel and Distributed Systems, vol. 23, iss. 5, pp. 944-957, May 2012

• Gwo-Giun Lee, He-Yuan Lin, "Method of analyzing intrinsic parallelism of algorithm," USA, Patent No. US8522224 B2, Aug. 27, 2013.

• Gwo-Giun Lee, Ming-Jiun Wang, He-Yuan Lin, "Method and Algorithm Analyzer for Determining a Design Framework," USA, Patent No.

US8621414 B2, Dec. 31, 2013.

• (Boston, MA, June 1, 2015, GLOBE NEWSWIRE)

Page 24: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

Degree of Parallelism: Eigen-Analysis of DFGs using SGT

1

0

0

0

1

1

1 0 0 0 0 1

0 1 0 0 0 1

0 0 1 0 1 0

0 0 0 1 1 0

0 0 1 1 2 0

1 1 0 0 0 2

− −

− − − − −

+ +

+

A1 B1 C1 D1

O1

+ +

+

A2 B2 C2 D2

O2

O1=A1+B1+C1+D1

O2=A2+B2+C2+D2

1 2

6

43

5

Algorithm Dataflow diagram

Causation graphLaplacian matrix

Spectrum

0, 0, 1, 1, 3, 3Parallelism

2

0

1

1

1

0

0

0

0

1

1

0

0

0

0

0

0

1

1

0

2

1

1

0

0

2

0

0

0

1

1

Eigenvalue:

Eigenvector:

×

(Homogeneous)

2019/9/25 24Bioinfotronics Research Center Prof. Gwo Giun Lee/NCKUEE Media SoC Lab.

Quantification of parallelization,

Instruction Set Architecture (ISA) design

Page 25: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

Reconfigurable Computing (Inverse Transform): Efficient & Flexible Mobile Edge

2019/9/25 Bioinfotronics Research Center Prof. Gwo Giun Lee/NCKUEE Media SoC Lab. 25

• Commonalities are analyzed on DFGs for reuse when synthesizing or

reconfiguring the CNN computational platform.

▪ Introduce efficient flexible architecture with algorithmic convolution for

CNN are eigen-transformed to matrix operations with higher symmetry

(d) Reconfigurable Architecture of NVDLA

(a)

(b)

(c) Reconfigurable architecture

Reconfigurable

Datapath

Microcode 1

Microcode

Memory

Microcode 2

Microcode

Memory

Reload

Microcode

Reconfigurable

Datapath

Microsequencer Microsequencer

Reconfigurable

Control

Reconfigurable

Control

Page 26: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

Reconfigurable Architecture: Commonality Extraction from DFGs

2019/9/25 Bioinfotronics Research Center Prof. Gwo Giun Lee/NCKUEE Media SoC Lab. 26

AVC/H.264 filter coefficients

[1,-5,20,20,-5,1]

MPEG-4 filter coefficients

[-8,24,-48,160,160,-48,24,-8]

[-1,3,-6,20,20,-6,3,-1]Divide by 8

MPEG-4 Chroma interpolation, MPEG2,

and AVC/H.264 chroma prediction

[1 1]

[1 1 1 1]

+x[0]

x[5]

x[1]

x[4]

x[2]x[3]

+

+

2 +

2 +

4

- +

+

result

i : left shift by i i : right shift by i + : addition

+x[0]

x[7]

x[1]

x[6]

x[2]

x[5]

+

+

1 +

2 +

1

-3+ result

x[3]

x[4]+ 2 +

4

+

+

+x[0]

x[1]

x[2]

x[3]+

+ resultx[0]

x[1]+ result

• Observe the common parts between each dataflow graph.

Page 27: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

+

+

+

x[1]

x[2]

x[3]

x[0]

result+

+

+

result

Module 0

Module 1

Module 2

Module 3

Module 4

x[1]

x[6]

x[2]

x[5]

x[3]

x[4]

x[0]

x[7]

x[5]

0

x[1]

x[4]

x[2]

x[3]

x[0]

0

x[1]

0

x[2]

0

x[3]

0

x[0]

x[1]

+

+

+

result

x[0]

x[7]+

3

+ 1 +x[1]

x[6]

x[2]

x[5]+ 2 +

1

-

x[3]

x[4]+ 2 +

4

+x[1]

x[0]

result

Reconfigurable fractional interpolation

[-8,24,-48,160,160,-48,24,-8]

+ 1 +

+

+ 2 +

4

3Module 4

[1,-5,20,20,-5,1]

+ 2 +

1

- + 2 + -

+ 2 +

4

[1 1 1 1] [1 1]

+

Coefficients

Module 3

Module 2

Module 1

Module 0

+

+

+

result

+ 2 +

1

-

x[5]

x[1]

x[4]

x[2]

x[3]

x[0]

+ 2 +

4

2019/9/25 Bioinfotronics Research Center Prof. Gwo Giun Lee/NCKUEE Media SoC Lab. 27

Page 28: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

Performed PCA to extract the commonality in Gabor filter.

The transformed Gabor filter bank has four symmetry patterns. Their coefficient are illustrated in the following:

2019/9/25 Bioinfotronics Research Center Prof. Gwo Giun Lee/NCKUEE Media SoC Lab. 28

c0 c1 c2 c3 c2 c1 c0

c1 c4 c5 c6 c5 c4 c1

c2 c5 c7 c8 c7 c5 c2

c3 c6 c8 c9 c8 c6 c3

c2 c5 c7 c8 c7 c5 c2

c1 c4 c5 c6 c5 c4 c1

c0 c1 c2 c3 c2 c1 c0

0 c1 c2 c3 c2 c1 0

-c1 0 c5 c6 c5 0 -c1

-c2 -c5 0 c8 0 -c5 -c2

-c3 -c6 -c8 0 -c8 -c6 -c3

-c2 -c5 0 c8 0 -c5 -c2

-c1 0 c5 c6 c5 0 -c1

0 c1 c2 c3 c2 c1 0

c0 c1 c2 0 -c2 -c1 -c0

c1 c4 c5 0 -c5 -c4 -c1

c2 c5 c7 0 -c7 -c5 -c2

0 0 0 0 0 0 0

-c2 -c5 -c7 0 c7 c5 c2

-c1 -c4 -c5 0 c5 c4 c1

-c0 -c1 -c2 0 c2 c1 c0

0 c1 c2 0 -c2 -c1 0

-c1 0 c5 0 -c5 0 c1

-c2 -c5 0 0 0 c5 c2

0 0 0 0 0 0 0

c2 c5 0 0 0 -c5 -c2

c1 0 -c5 0 c5 0 -c1

0 -c1 -c2 0 c2 c1 0

Pattern 1: Pattern 2: Pattern 3: Pattern 4:

: Repeated Coefficient

: Coefficient 0 : Zero Coefficient

Coefficient:

[c0, c1, c2, c3, c4, c5, c6, c7, c8, c9]

Coefficient:

[c1, c2, c3, c5, c6, c8]

Coefficient:

[c0, c1, c2, c4, c5, c7]

Coefficient:

[c1, c2, c5]

Four Symmetrical Patterns After PCA

w0

w12

w2

w13

w5

w9

w1 w4 w8 w3 w7 w11

w15

w6 w10 w14

Page 29: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

Module

1

Module

2Module

3

Module

0

x[0]

x[48]

……

……

……

……

……

..

x[1]

x[47]x[2]

x[46]x[3]

x[45]

x[25]

x[23]

x[24]

+

x[0]

x[48]

……

……

……

……

……

..

x[1]

x[47]x[2]

x[46]x[3]

x[45]

x[25]

x[23]

x[24]

x[0]

x[48]

……

……

……

……

……

..

x[1]

x[47]x[2]

x[46]x[3]

x[45]

x[25]

x[23]

x[24]

x[0]

x[48]

……

……

……

……

……

..

x[1]

x[47]x[2]

x[46]x[3]

x[45]

x[25]

x[23]

x[24]

k5

k1

×

×

×

×

×

×

×

×

×

×

c0f0

f1

f2

f3

f4

f5

f6

f7

f8

f9

f10

f11

f12

f13

f14

f15

e0

e1

e2

e3

e4

e6

e5

e7

e8

e10

e9

e11

e12

e14

e13

e15

e16

e18

e17

e19

e20

e21

e22

e23

e24

x[0]

x[48]+

……

……

……

……

……

..

x[1]

x[47]+

x[2]

x[46]+

x[3]

x[45]+

x[25]+

x[23]

x[24] +

……

……

……

……

……

..

+

+

+

+

+

+

+

+

+

+

+

+

+

+

+

k0

k2

k3

k4

k6

k7

k8

k9

+

c1

c2

c3

c4

c5

c6

c7

c8

c9

k5

k1 ×

×

×

×

×

×

f1

f2

f3

f4

f6

f7

f8

f9

f11

f12

f13

f14

e1

e2

e3

e4

e5

e7

e10

e9

e11

e14

e13

e15

e17

e19

e20

e21

e22

e23

……

……

……

……

……

..

x[1]

x[47]+

x[2]

x[46]+

x[3]

x[45]+

x[25]+

x[23]

……

……

……

……

……

..

+

+

+

+

+

+

+

+

+

+

+

+

k2

k3

k6

k8

+

c1

c2

c3

c5

c6

c8

k5

k1

×

×

×

×

×

×

c0f0

f1

f2

f4

f5

f6

f8

f9

f10

e0

e1

e2

e3

e6

e5

e7

e8

e9

e11

e12

e14

e13

e15

e16

e18

e19

e20

x[0]

x[48]+

……

……

……

……

……

..

x[1]

x[47]+

x[2]

x[46]+

x[3]

x[45]+

……

……

……

……

……

..

+

+

+

+

+

+

+

+

+

+

+

+

k0

k2

k4

k7

+

c1

c2

c4

c5

c7

+x[20]

x[28]

Reconfigurable Transformed Gabor Filter Bank

Coefficients Module 3Module 2Module 1Module 0

c0 c1 c2 c3 c2 c1 c0

c1 c4 c5 c6 c5 c4 c1

c2 c5 c7 c8 c7 c5 c2

c3 c6 c8 c9 c8 c6 c3

c2 c5 c7 c8 c7 c5 c2

c1 c4 c5 c6 c5 c4 c1

c0 c1 c2 c3 c2 c1 c0

0 c1 c2 c3 c2 c1 0

-c1 0 c5 c6 c5 0 -c1

-c2 -c5 0 c8 0 -c5 -c2

-c3 -c6 -c8 0 -c8 -c6 -c3

-c2 -c5 0 c8 0 -c5 -c2

-c1 0 c5 c6 c5 0 -c1

0 c1 c2 c3 c2 c1 0

c0 c1 c2 0 -c2 -c1 -c0

c1 c4 c5 0 -c5 -c4 -c1

c2 c5 c7 0 -c7 -c5 -c2

0 0 0 0 0 0 0

-c2 -c5 -c7 0 c7 c5 c2

-c1 -c4 -c5 0 c5 c4 c1

-c0 -c1 -c2 0 c2 c1 c0

0 c1 c2 0 -c2 -c1 0

-c1 0 c5 0 -c5 0 c1

-c2 -c5 0 0 0 c5 c2

0 0 0 0 0 0 0

c2 c5 0 0 0 -c5 -c2

c1 0 -c5 0 c5 0 -c1

0 -c1 -c2 0 c2 c1 0

Pattern 1

[c0, c1, c2, c3, c4,

c5, c6, c7, c8, c9]

Even

Vertical

Symmetry

Even

Diagonal

Symmetry

Multiplying

Coefficients

Pattern 2

[c1, c2, c3, c5, c6, c8]Even

Vertical

Symmetry

Odd

Diagonal

Symmetry

Pattern 3

[c0, c1, c2, c4, c5, c7]Odd

Vertical

Symmetry

Even

Diagonal

Symmetry

Odd

Vertical

Symmetry

Odd

Diagonal

Symmetry

Pattern 4

[c1, c2, c5]Even

Point

Symmetry

k5

k1 ×

×

×

f1

f2

f4

f6

f8

f9

e1

e2

e3

e6

e5

e7

e9

e11

e14

e13

e15

e19

e20

……

……

……

……

……

..

x[1]

x[47]+

x[2]

x[46]+

x[3]

x[45]+

……

……

……

……

……

..

+

+

+

+

+

+

+

+

+

k2

+

c1

c2

c5

x[20]

x[28] +: multiplication

+ : addition

×

: multiplied by -1

Page 30: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

2019/9/25 Bioinfotronics Research Center Prof. Gwo Giun Lee/NCKUEE Media SoC Lab. 30

A Very Useful SMART Sensor System

SMARTLET: SMART

toiLET

SMART is a BUZZ word

that SELLS

Page 31: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

AAC Study Cases:

Observe and Learn from Nature in Engineering Innovations.

2019/9/25 31Bioinfotronics Research Center Prof. Gwo Giun Lee/NCKUEE Media SoC Lab.

Page 32: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

Multimedia

Page 33: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

[3] Y. Murachi, K. Hamano, T. Matsuno, J. Miyakoshi, M. Miyama, and M. Yoshimoto, “A 95 mW MPEG2 MP@HL motion estimation processor core for portable high-resolution video

application,” IEICE Trans. Fund. Electron. Commun. Comput. Sci., vol. E88-A, pp. 3492–3499, Dec. 2005.

Algorithm/architecture co-design of

spatial-temporal recursive motion estimator

Terms MHS[3] Our

PSNR 35.41 dB 35.45 dB

Application 1920x1080 @ 30 FPS 1920x1080 @ 30 FPS

Search range H: ± 128, V: ± 64 H: ± 128, V: ± 64

Technology 0.18 mm 0.18 mm

Clock rate 108 MHz 81 MHz

Total cell area 562.5 K gates 51.2 K gates

Picture n-1 Picture nPicture n-2

MVn-1

• Spatial-temporal recursive ME

• Architecture comparison

[1] Y. Nie and K. Ma, “Adaptive irregular pattern search with matching prejudgment for fast block-matching motion Estimation,” IEEE Trans. Circuits Syst. Video Technol., vol. 15, no.

6, pp. 789–794, Jun. 2005.

[2] C. Zhu, X. Lin, L. Chau, and L. Po, “Enhanced hexagonal search for fast block motion estimation,” IEEE. Trans. Circuits Syst. Video Technol., vol. 14, no. 10, pp. 1210–1214, Oct. 2004.

H: horizontal; V: vertical

─ Initial candidates from spatial and temporal

references followed by local search

➔ blocks cannot be processed in parallel

33.80

33.82

33.84

33.86

33.88

33.90

10 15 20 25 30 35 40

PS

NR

(d

B)

# of search points

Assumption: 16 processing elements performing accumulation of

absolute difference with utilization 75%

# of search points 10 15 20 25 30 35 40

Clock rate (MHz) 54 81 108 135 162 189 216

• Complexity vs. performance

35.63

35.28 35.32 35.41 35.45

34.80

35.00

35.20

35.40

35.60

35.80

36.00

FS AIPS[1] EHS[2] MHS[3] Our

PS

NR

(d

B)

ME algorithms

PSNR Comparision

• Performance comparison

Motion trajectory

Page 34: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

Mobile Health: Deep Learning

Self Organizing Cerebral Organoids by Madeline Lancaster

Page 35: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

Convolutional Neural Network (CNN)

Convolution layers for feature extraction & fully connected network asclassifiers

Feature layers updated or information mined from large amount of data via supervised learning

Bayesian learning and Linsker’s self-organizing Kohonen feature map

Convolution/feature layers constitute multiresolution pyramid like AzrielRosenfeld?

Convolution layer Sub-sampling/

Pooling layer Convolution layer Sub-sampling/

Pooling layerFully connected

network

Input layer 3 feature maps 3 sub-sampled

feature maps

6 sub-sampled

feature maps

6 sub-sub-sampled

feature maps Normal layer Output layer

2019/9/25 35Bioinfotronics Research Center Prof. Gwo Giun Lee/NCKUEE Media SoC Lab.

Page 36: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

Third Harmonically Generated Melasma Images with different Dendricity Levels as described by

Medical Experts

2019/9/25 Bioinfotronics Research Center Prof. Gwo Giun Lee/NCKUEE Media SoC Lab. 36

Normal Image Dendritic Image

More Dendritic Images Most Dendritic Image

Page 37: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

Gabor Features to Characterize Dendricity Directions

2019/9/25 Bioinfotronics Research Center Prof. Gwo Giun Lee/NCKUEE Media SoC Lab. 37

Original Image

Combination Results

Direction = 0° Direction = 45°

Direction = 90° Direction = 135°

Page 38: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

Gabor Features and PCA Transformed Filters with Higher Symmetry

2019/9/25 Bioinfotronics Research Center Prof. Gwo Giun Lee/NCKUEE Media SoC Lab. 38

(a) Gabor filter with parameters: σx = σy = 2.46 / π, ω = π / 2, and θ = 0.

(b) Gabor filter with parameters: σx = σy = 2.46 / π, ω = π / 2, and θ = π / 2.

(c) Coefficients distribution for two Gabor filters.

(d) Transformed filter for first Gabor filter.

(e) Transformed filter for second Gabor filter.

(f) Coefficients distribution for two transformed filters.

Page 39: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

Comparison for Gabor Filter Bank and

Transformed Gabor Filter Bank

⚫We perform the Gabor filter bank consisted of 16 Gabor filters over the 512×512

image with zero-padding.

Dataflow ModelGabor Filter

Bank

Transformed

Gabor Filter

Bank

Remark

Number of

operations

Addition 201,326,592 72,351,744

Multiplication 205,520,896 60,817,408

Storage requirement (bits) 139,776 119,808

Data transfer (bits) 142,606,336 142,606,336Estimated for total or average but peak data transfer should

drop.

Degree of

parallelism16 1

Implementation of Gabor filter bank is performing convolution with 16

Gabor filters sequentially, since we want the same starting point to

compare.

Execution

time

(sec)

Intel Core i7-5820k 1.85 0.19

Intel Core i7-3770 2.51 0.27

Intel Core i5-3550 2.01 0.21

Intel Core i7-930 2.78 0.33

2019/9/25 Bioinfotronics Research Center Prof. Gwo Giun Lee/NCKUEE Media SoC Lab. 39

Page 40: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive

Conclusion

Algorithm and Architecture needs to be looked at

together

Provides flexible, high accuracy, high efficiency, low

power, and LOW COST designs

Methodology was adopted by industry in deploying

50+ million units of LCD Panels worldwide

Cross level of abstraction framework which

systematically models computational (5+G) platforms

to solve cross disciplinary problems in SMART

manners for another half a century (?)

2019/9/25 40Bioinfotronics Research Center Prof. Gwo Giun Lee/NCKUEE Media SoC Lab.

Page 41: Algorithm/Architecture Co-design for Smart Signals and ...site.ieee.org/scv-cas/files/2019/09/2019Lee.pdf · Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive