Slide 2
Introduction
Integer motion estimation
Fractional motion estimation
Parameterized power-scalable encoding system
Flexible system architecture
Implementation results
Conclusion
Slide 3
A power-aware encoder can adjust its power consumption in response to
different conditions, e.g. user preferences and battery state.
[Figure: battery capacity and lifetime; power-aware encoder]
Slide 4
This paper provides multiple operating configurations between
points C and D, so the encoder can adapt to different environmental
conditions.
[Figure: power-aware encoder]
Slide 6
The design integrates low-power techniques at the algorithm
level and the architecture level.
Hardware-oriented fast algorithm: improves data reuse capability.
Content-aware algorithm: achieves a good tradeoff between coding
performance and computational complexity.
Slide 7
Parallel-VBS-IME algorithm: computes the matching costs of all
block sizes with the same MVs simultaneously.
Intra-candidate data reuse: the 4x4 block costs are computed first, and
the costs of larger block sizes are obtained immediately by summing the
corresponding 4x4 costs.
Inter-candidate data reuse: for two horizontally neighboring candidates
of a 16x16 block, 16x15 reference pixels overlap and can be shared.
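The intra-candidate data reuse described above can be sketched as follows. This is a minimal illustration, not the encoder's hardware implementation: the function names, the `BLOCK_SIZES` table, and the list-of-lists pixel layout are assumptions made for the example.

```python
# Sketch of intra-candidate data reuse for one candidate MV.
# cur and ref are 16x16 lists of pixel values (hypothetical inputs).

def sad_4x4_grid(cur, ref):
    """Return a 4x4 grid holding the SAD of each 4x4 sub-block."""
    grid = [[0] * 4 for _ in range(4)]
    for y in range(16):
        for x in range(16):
            grid[y // 4][x // 4] += abs(cur[y][x] - ref[y][x])
    return grid

# Block sizes as (name, width, height) in units of 4x4 sub-blocks.
BLOCK_SIZES = [
    ("4x4", 1, 1), ("8x4", 2, 1), ("4x8", 1, 2), ("8x8", 2, 2),
    ("16x8", 4, 2), ("8x16", 2, 4), ("16x16", 4, 4),
]

def vbs_costs(cur, ref):
    """Derive the SAD of every block size from the sixteen 4x4 SADs."""
    grid = sad_4x4_grid(cur, ref)
    costs = {}
    for name, w, h in BLOCK_SIZES:
        # One cost per partition, scanned row by row.
        costs[name] = [
            sum(grid[r][c]
                for r in range(top, top + h)
                for c in range(left, left + w))
            for top in range(0, 4, h)
            for left in range(0, 4, w)
        ]
    return costs
```

The pixel differences are computed only once, in `sad_4x4_grid`; every larger block size is just a sum of entries from that grid.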
Slide 8
Parallel-VBS-FSS: good for inter-candidate data reuse.
Parallel-VBS-IME is adopted.
[Figure labels: move to the locally best candidate; locally best is
at the center]
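The two figure labels suggest an iterative search that re-centers on the locally best candidate until the best candidate is the center itself. The sketch below is a generic illustration of that loop under stated assumptions: the 3x3 neighborhood, the `max_iters` bound, and the `cost` callback are hypothetical and are not taken from the slides.

```python
def local_search(cost, start, max_iters=16):
    """Move to the locally best candidate until the center is the best."""
    center = start
    for _ in range(max_iters):
        # The center comes first so that ties keep the current position.
        candidates = [center] + [
            (center[0] + dx, center[1] + dy)
            for dy in (-1, 0, 1)
            for dx in (-1, 0, 1)
            if (dx, dy) != (0, 0)
        ]
        best = min(candidates, key=cost)
        if best == center:
            # Locally best is at the center: stop searching.
            return center
        # Otherwise move to the locally best candidate and repeat.
        center = best
    return center
```

In a parallel-VBS setting, `cost` would be evaluated for every block size at each candidate; the loop structure itself stays the same.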
Slide 9
Multi-iteration parallel-VBS-FSS algorithm: if motion activity is
high, more initial candidates are set to find accurate MVs.
[Figure: 6 initial candidates, search window, predicted motion
window (PMW)]
Slide 10
Six initial candidates:
(0,0)
MV predictor: the median MV of the left, up, and up-right blocks.
The remaining four are used to find good matches in complex-motion
regions.
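As a small illustration of the MV predictor candidate, the sketch below takes the median of the three neighboring MVs. It assumes a component-wise median, which is how H.264 commonly defines its MV predictor; the slide itself only says "median MV", and the function name is hypothetical.

```python
def median_mv(left, up, up_right):
    """Component-wise median of three neighboring MVs given as (x, y)."""
    neighbors = (left, up, up_right)
    xs = sorted(mv[0] for mv in neighbors)
    ys = sorted(mv[1] for mv in neighbors)
    # The middle element of three sorted values is the median.
    return (xs[1], ys[1])
```

For example, `median_mv((1, 5), (3, -2), (2, 0))` takes the median of 1, 3, 2 horizontally and of 5, -2, 0 vertically.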
Slide 11
Content-adaptive strategy: the PMW is adaptively shrunk
according to the neighboring motion activity.
Slide 12
The search candidate conditionally moves vertically or
horizontally.
Flexible memory access supports efficient data reuse.
[Figure labels: rotate right one, rotate right two, rotate right
three; A2-D2 or B0-B3, A2-D2]
Slide 13
1. Reference and current frame
2. Current MB; reference MBs
3. 16x16
4. Compute the absolute difference values
5. Compute SAD
[Figure labels: two-directional random access; intra data reuse;
inter data reuse]
Slide 14
Advanced mode pre-decision algorithm: the N best modes (N = 0 to 7)
are pre-decided after IME with integer-pixel precision. Only the N
best modes are refined to quarter-pixel precision, which reduces
computation.
Hardware-oriented one-pass algorithm: the half-pixel and
quarter-pixel candidates are processed simultaneously to share the
memory access data and reduce memory access by 50%.
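The mode pre-decision step can be sketched as a simple ranking by integer-pixel cost. This is a minimal illustration: the function name, the dictionary layout, and the example cost values are assumptions, not data from the paper.

```python
def pre_decide_modes(integer_costs, n):
    """Return the n modes with the lowest integer-pixel matching cost.

    integer_costs maps a mode name to its cost after IME (hypothetical
    layout); only the returned modes would go on to fractional refinement.
    """
    ranked = sorted(integer_costs, key=integer_costs.get)
    return ranked[:n]


# Example costs are made up for illustration.
example_costs = {"16x16": 900, "16x8": 850, "8x16": 870, "8x8": 910}
refined = pre_decide_modes(example_costs, 2)
```

Lowering `n` reduces the number of modes that reach quarter-pixel refinement, which is what makes it usable as a power-scaling knob.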
Slide 16
Q is a 4x4 block of a quarter-pixel candidate, and it is
bilinearly interpolated from two 4x4 blocks (A and B) of half-pixel
candidates. The data processing power for the HT of all quarter-pixel
candidates is saved.
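One way to read the saving above: the Hadamard transform (HT) is linear, so the transform of the average of A and B equals the average of their transforms. The sketch below checks that property for 4x4 blocks. It assumes an unrounded average and a plain 4x4 Hadamard matrix; a real encoder rounds the interpolated pixels, so this is an idealized illustration rather than the paper's exact datapath.

```python
# 4x4 Hadamard matrix (symmetric, so it equals its own transpose).
H4 = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
    [1, -1, 1, -1],
]

def matmul(a, b):
    """Multiply two 4x4 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def hadamard_4x4(block):
    """2-D Hadamard transform: H4 * block * H4."""
    return matmul(matmul(H4, block), H4)

def average(a, b):
    """Unrounded bilinear interpolation midway between two blocks."""
    return [[(a[i][j] + b[i][j]) / 2 for j in range(4)] for i in range(4)]

# Hypothetical half-pixel blocks A and B.
A = [[i + j for j in range(4)] for i in range(4)]
B = [[3 * i - j for j in range(4)] for i in range(4)]

ht_of_q = hadamard_4x4(average(A, B))                  # transform Q directly
q_from_hts = average(hadamard_4x4(A), hadamard_4x4(B))  # reuse HT(A), HT(B)
```

Under these assumptions `ht_of_q` and `q_from_hts` are identical, so the quarter-pixel candidates need no transform of their own.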
Slide 17
Drop of 0.06 dB with the same memory access.
Slide 18
Parallel architecture: generates the half-pixel reference data
from integer-pixel reference data, and the quarter-pixel
reference data from half-pixel reference data.
Slide 19
Power-scalable parameters for the IME, FME, intra prediction (IP),
and deblocking (DB) engines flexibly control the power consumption
of the whole encoding system.
Slide 23
The curve shows the best coding performance with the highest
power consumption. 2.69% bit rate increase and 0.12 dB quality drop
on average.
Slide 24
Huang's H.264/AVC encoder
Lin's low-power MPEG-4 encoder
Two reference frames to one reference frame.
Multi-iteration IME and FME.
Power scalability of IP and DB.
Slide 26
A low-power and power-aware H.264/AVC video encoder has been
proposed. The power efficiency was co-optimized at the algorithm,
architecture, and circuit levels. The encoder provides competitive
power efficiency for D1 (720x480) 30 frames/s video encoding and the
best power configurations compared with previous state-of-the-art
designs.