Algorithm and Architecture Design of Power-Oriented H.264/AVC Baseline Profile Encoder for Portable Devices
Yu-Han Chen, Tung-Chien Chen, Chuan-Yung Tsai, Sung-Fang Tsai, and Liang-Gee Chen, Fellow, IEEE. IEEE CSVT 2009

  • Slide 1
  • Yu-Han Chen, Tung-Chien Chen, Chuan-Yung Tsai, Sung-Fang Tsai, and Liang-Gee Chen, Fellow, IEEE. IEEE CSVT 2009
  • Slide 2
  • Outline: Introduction; Integer motion estimation; Fractional motion estimation; Parameterized power-scalable encoding system; Flexible system architecture; Implementation results; Conclusion.
  • Slide 3
  • A power-aware encoder can adjust its power consumption in response to different conditions, for example user preferences and battery state. Figure labels: battery capacity, lifetime, power-aware encoder.
  • Slide 4
  • The power-aware encoder in this paper provides multiple operating configurations between points C and D, so it can adapt to different environmental conditions.
  • Slide 6
  • The design integrates low-power techniques at the algorithm level and the architecture level. Hardware-oriented fast algorithms improve data reuse capability. Content-aware algorithms achieve a good tradeoff between coding performance and computational complexity.
  • Slide 7
  • Parallel-VBS-IME algorithm: computes the matching costs of all block sizes for the same MV simultaneously (VBS: variable block size; IME: integer motion estimation). Intra-candidate data reuse: the 4×4 block costs are computed first, and the costs of larger blocks are obtained immediately by summing the corresponding 4×4 costs. Inter-candidate data reuse: for two horizontally neighboring candidates of a 16×16 block, 16×15 reference pixels overlap and can be shared.
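
The intra-candidate reuse on this slide can be sketched in Python. This is an illustrative sketch, not the authors' hardware design: the function names (`sad_4x4_grid`, `block_sad`, `all_block_sads`) and the list-of-lists pixel layout are assumptions made here. The sixteen 4×4 SADs are computed once for a motion vector, and every larger partition cost is then obtained by summation only.

```python
def sad_4x4_grid(cur, ref):
    """Return a 4x4 grid of SADs, one per 4x4 sub-block of a 16x16 macroblock."""
    grid = [[0] * 4 for _ in range(4)]
    for y in range(16):
        for x in range(16):
            grid[y // 4][x // 4] += abs(cur[y][x] - ref[y][x])
    return grid


def block_sad(grid, by, bx, h, w):
    """Sum the 4x4 SADs covering an h-by-w block whose top-left grid cell is (by, bx)."""
    return sum(grid[r][c]
               for r in range(by, by + h // 4)
               for c in range(bx, bx + w // 4))


def all_block_sads(grid):
    """Costs of every partition, keyed by (height, width), for one motion vector."""
    sizes = [(16, 16), (16, 8), (8, 16), (8, 8), (8, 4), (4, 8), (4, 4)]
    costs = {}
    for h, w in sizes:
        costs[(h, w)] = [block_sad(grid, by, bx, h, w)
                         for by in range(0, 4, h // 4)
                         for bx in range(0, 4, w // 4)]
    return costs
```

For one MV this yields 41 partition costs (1 + 2 + 2 + 4 + 8 + 8 + 16) from a single pass over the 256 pixel differences. For the inter-candidate case, two candidates one pixel apart horizontally share 16 × 15 = 240 of their 256 reference pixels.
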
  • Slide 8
  • Parallel-VBS-FSS: well suited to inter-candidate data reuse; Parallel-VBS-IME is adopted. Flowchart labels: move to the locally best candidate; locally best candidate is at the center.
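
The "move to locally best" loop can be illustrated with a simplified four-step-search sketch in Python. Several things here are assumptions rather than details from the slides: the 3×3 search pattern, the step sizes (2, then a final step of 1), the `max_steps` and `search_range` parameters, and the `cost` callback, which stands in for whatever matching cost the encoder uses.

```python
def neighbors(center, step, search_range):
    """3x3 pattern of candidate MVs around center, clipped to the search range."""
    cx, cy = center
    pts = [(cx + dx, cy + dy)
           for dy in (-step, 0, step)
           for dx in (-step, 0, step)]
    return [p for p in pts
            if abs(p[0]) <= search_range and abs(p[1]) <= search_range]


def four_step_search(cost, start=(0, 0), max_steps=3, search_range=16):
    """Move the center to the locally best candidate until the best one is the center."""
    center = start
    for _ in range(max_steps):
        best = min(neighbors(center, 2, search_range), key=cost)
        if best == center:  # locally best is at the center: stop coarse steps
            break
        center = best
    # Final refinement with a step of one pixel.
    return min(neighbors(center, 1, search_range), key=cost)
```

With a toy convex cost such as `lambda mv: (mv[0] - 5) ** 2 + (mv[1] + 3) ** 2`, the search walks from (0, 0) to (5, -3).
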
  • Slide 9
  • Multi-iteration parallel-VBS-FSS algorithm: if motion activity is high, more initial candidates are set to find accurate MVs. Figure labels: 6 initial candidates, search window, predicted motion window (PMW).
  • Slide 10
  • Six initial candidates: (0,0); the MV predictor, which is the median MV of the left, up, and up-right blocks; and four others used to find a good match in complex-motion regions.
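
The median MV predictor can be sketched as a component-wise median, which is how H.264 forms its MV predictor in the common case. The helper names `median_mv` and `initial_candidates` are hypothetical, and the `extra` argument is only a placeholder: the slides do not say how the remaining four candidates are chosen, so none are invented here.

```python
def median_mv(left, up, up_right):
    """Component-wise median of three neighboring motion vectors (x, y)."""
    xs = sorted(v[0] for v in (left, up, up_right))
    ys = sorted(v[1] for v in (left, up, up_right))
    return (xs[1], ys[1])


def initial_candidates(left, up, up_right, extra=()):
    """(0, 0), the median predictor, then caller-supplied extras, without duplicates."""
    candidates = [(0, 0), median_mv(left, up, up_right)]
    candidates.extend(extra)
    unique = []
    for mv in candidates:
        if mv not in unique:
            unique.append(mv)
    return unique
```

For neighbors (1, 4), (3, -2), and (2, 0), the predictor is (2, 0): the median of the x components 1, 3, 2 and of the y components 4, -2, 0.
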
  • Slide 11
  • Content-adaptive strategy: the PMW is adaptively shrunk according to the neighboring motion activity.
  • Slide 12
  • The search candidate conditionally moves vertically or horizontally. Flexible memory access supports efficient data reuse. Figure labels: rotate right by one, two, or three; A2-D2 or B0-B3.
  • Slide 13
  • Data flow (figure labels): (1) reference and current frames; (2) current MB and reference MBs (two-directional random access); (3) 16×16; (4) compute the absolute difference values; (5) compute the SAD. The figure also marks intra data reuse and inter data reuse.
  • Slide 14
  • Advanced mode pre-decision algorithm: the N best modes (N = 0 to 7) are pre-decided after IME with integer-pixel precision, and only these N modes are refined to quarter-pixel precision, which reduces computation. Hardware-oriented one-pass algorithm: the half-pixel and quarter-pixel candidates are processed simultaneously to share memory access data, reducing memory access by 50%.
  • Slide 15
  • Hardware-oriented one-pass algorithm. Two-step algorithm: 17 candidates; one-pass algorithm: 25 candidates. Figure legend: integer-pixel, half-pixel, quarter-pixel.
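
One way to read the counts 17 and 25 is as the number of search positions around the integer-pixel best match. The Python sketch below is that interpretation only, not something the slides state: it assumes a conventional two-step refinement (the center plus 8 half-pixel neighbors, then 8 quarter-pixel neighbors of the half-pixel winner) and a one-pass 5×5 grid at quarter-pixel spacing. Coordinates are in quarter-pixel units, and the function names are hypothetical.

```python
def two_step_candidates(best_half=(0, 0)):
    """Positions visited by a two-step search, given the winner of the half-pixel step."""
    # Step 1: the integer-pixel center and its 8 half-pixel neighbors.
    step1 = {(dx, dy) for dx in (-2, 0, 2) for dy in (-2, 0, 2)}
    # Step 2: the 8 quarter-pixel neighbors of the half-pixel winner.
    bx, by = best_half
    step2 = {(bx + dx, by + dy)
             for dx in (-1, 0, 1) for dy in (-1, 0, 1)} - {(bx, by)}
    return step1 | step2


def one_pass_candidates():
    """All positions within half a pixel of the center, evaluated in a single pass."""
    return {(dx, dy) for dx in range(-2, 3) for dy in range(-2, 3)}
```

Under these assumptions the two-step search visits 9 + 8 = 17 positions, while the one-pass search evaluates all 5 × 5 = 25 positions but reads the reference data once.
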
  • Slide 16
  • Q is a 4×4 block of a quarter-pixel candidate, bilinearly interpolated from two 4×4 blocks (A and B) of half-pixel candidates. The data processing power for the Hadamard transform (HT) of all quarter-pixel candidates is saved.
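
The saving plausibly rests on the linearity of the Hadamard transform: if Q = (A + B) / 2, the transformed residue of Q is the average of the transformed residues of A and B, so no separate HT is needed for Q. The Python sketch below checks that identity in exact integer arithmetic. The helper names are hypothetical, and rounding in a real interpolator would make the relation approximate rather than exact.

```python
H = [[1, 1, 1, 1],
     [1, 1, -1, -1],
     [1, -1, -1, 1],
     [1, -1, 1, -1]]  # symmetric 4x4 Hadamard matrix


def matmul(a, b):
    """Product of two 4x4 matrices given as lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def hadamard_4x4(block):
    """2-D Hadamard transform H * block * H (H is symmetric)."""
    return matmul(matmul(H, block), H)


def residual(cur, ref):
    """Element-wise difference between the current block and a reference block."""
    return [[cur[y][x] - ref[y][x] for x in range(4)] for y in range(4)]


def doubled_ht_of_midpoint(cur, a, b):
    """2 * HT(cur - (a + b) / 2), computed only from HT(cur - a) and HT(cur - b)."""
    ht_a = hadamard_4x4(residual(cur, a))
    ht_b = hadamard_4x4(residual(cur, b))
    return [[ht_a[y][x] + ht_b[y][x] for x in range(4)] for y in range(4)]
```

The factor of two keeps the check in integers; halving the result gives the transformed residue of Q.
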
  • Slide 17
  • Quality drops by 0.06 dB with the same memory access.
  • Slide 18
  • Parallel architecture: the half-pixel reference data are generated from the integer-pixel reference data, and the quarter-pixel reference data are generated from the half-pixel reference data.
  • Slide 19
  • Power-scalable parameters are provided for the IME, FME, intra prediction (IP), and deblocking (DB) engines, so the power consumption of the whole encoding system can be controlled flexibly.
  • Slide 20
  • Parameter settings: (1) 4, (2) 4, (3) 2+2, (4) 2. Power modes: 4 × 4 × 4 × 2 = 128.
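
The mode count is a plain Cartesian product of the per-parameter option counts, which a few lines of Python can enumerate. The parameter names `param1` to `param4` are placeholders: the slide numbers the parameters but this sketch does not assume which engine each one controls.

```python
from itertools import product

# Option counts per parameter as listed on the slide: 4, 4, 2+2, 2.
OPTION_COUNTS = {"param1": 4, "param2": 4, "param3": 2 + 2, "param4": 2}


def power_modes(option_counts=OPTION_COUNTS):
    """Enumerate every combination of parameter settings as a dict."""
    names = list(option_counts)
    ranges = [range(option_counts[name]) for name in names]
    return [dict(zip(names, combo)) for combo in product(*ranges)]
```

Calling `len(power_modes())` gives 128, matching 4 × 4 × 4 × 2.
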
  • Slide 23
  • The curve shows the best coding performance at the highest power consumption. On average, the bit rate increases by 2.69% and quality drops by 0.12 dB.
  • Slide 24
  • Comparison with Huang's H.264/AVC encoder and Lin's low-power MPEG-4 encoder. Two reference frames to one reference frame; multi-iteration IME and FME; power scalability of IP and DB.
  • Slide 26
  • A low-power, power-aware H.264/AVC video encoder has been proposed. Power efficiency was co-optimized at the algorithm, architecture, and circuit levels. The design provides competitive power efficiency for D1 (720×480) 30 frames/s video encoding and the best power configurations compared with previous state-of-the-art designs.