Laboratoire d'Intégration des Systèmes et des Technologies

System-Level Hardware-Based Protection of Memories against Soft-Errors

Valentin Gherman, Samuel Evain, Mickael Cartron, Nathaniel Seymour, Yannick Bonhomme

Motivation

New constraints
– Increasing design & manufacturing costs
– Decreasing time-to-market
– Increasing reliability and yield problems of nanometer technologies
  • Memory systems remain the most vulnerable

New requirements
– Low-cost solutions
  • Cross-domain & cross-application platform-based design
– Flexible solutions
  • Power, Performance, Reliability

Related EDAC-based memory protection schemes

Low cost [1, 3]
– Flexibility
– Standard interconnect & memory

Hardware-based [1, 2]
– Concurrent error detection
– Transient & permanent faults

System-level [3]
– Software-based

[1] GRLIB IP Core User's Manual, Version 1.0.19, September 2008, pages 227, 248
[2] R. Mariani, G. Boschi, Solid-State Electronics 49, 2005
[3] P.P. Shirvani, N. Saxena, E.J. McCluskey, Transactions on Reliability, September 2000

[Figure: block diagram with a processor core, interconnection, and standard memory storing data words and EDAC codes (EDAC1 … EDACn, EDAC1 … EDACm); annotated with references [1, 2] and [3].]

Reliability Service Manager (RSM)

Low cost [1, 3, RSM]
– Flexibility
– Standard interconnect & memory

Hardware-based [1, 2, RSM]
– Concurrent error detection
– Transient & permanent faults

System-level [3, RSM]

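[Figure: block diagram with a processor core, standard interconnection, and standard memory storing data words and EDAC codes (EDAC1 … EDACn, EDAC1 … EDACm), including the RSM block; annotated with references [1, 2] and [3].]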

RSM: address calculation of EDAC codes

@EDAC = OffSet + [ (Mask & @DW) >> log2(n) ]
EDAC code position in a memory word = @DW % n

– @DW = address of a protected data word (DW)
– OffSet, Mask are parameters
– n = DW width / EDAC code width

[Figure: standard memory layout with data words DW 1 … DW n and, starting at OffSet, the check words that hold the codes EDAC 1 … EDAC n.]
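The address calculation on this slide can be sketched in Python. This is a minimal illustration under stated assumptions, not the RSM hardware itself: it assumes word addressing, that n is a power of two, and that the terms of the formula combine by addition (OffSet +) and bitwise AND (Mask &). The function name `edac_location` and the parameter values in the usage lines are hypothetical.

```python
from typing import Tuple


def edac_location(dw_addr: int, offset: int, mask: int, n: int) -> Tuple[int, int]:
    """Return (check-word address, slot of the EDAC code inside that word).

    Assumptions (not stated on the slide): word addressing, n is a power
    of two, '+' joins OffSet with the shifted term, '&' applies the Mask.
    """
    if n <= 0 or n & (n - 1):
        raise ValueError("n must be a positive power of two")
    log2_n = n.bit_length() - 1
    # n data words share one check word, so drop the low log2(n) address bits.
    check_word_addr = offset + ((mask & dw_addr) >> log2_n)
    # The low address bits select which of the n codes in the check word is used.
    slot = dw_addr % n
    return check_word_addr, slot


# Hypothetical setup: 32-bit data words with 8-bit EDAC codes, so n = 4.
check_addr, slot = edac_location(0x25, offset=0x1000, mask=0x0FFF, n=4)
print(hex(check_addr), slot)  # prints: 0x1009 1
```

With these values, data words 0x24 to 0x27 all map to check word 0x1009 and occupy slots 0 to 3, which is the packing the memory layout figure suggests.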

RSM plugged on the bus arbiter (1)

Scales well with the number of masters in the system

Hides the supplementary RSM-memory accesses from the arbiter
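To illustrate what "supplementary RSM-memory accesses" could look like, here is a toy software model. It is only a sketch under assumptions that the slides do not confirm: one parity bit per 32-bit data word (so n = 32), a read-modify-write of the check word on every write, and a check-word address formed as offset plus the masked and shifted data address. The class `ToyRSM`, its methods, and all parameter values are hypothetical.

```python
WORD_BITS = 32  # assumed data word width; with 1-bit parity codes, n = 32


def parity(word: int) -> int:
    """Even-parity bit of a 32-bit word."""
    return bin(word & 0xFFFFFFFF).count("1") & 1


class ToyRSM:
    """Hypothetical model: sits between a master and a word-addressed memory."""

    def __init__(self, memory: dict, offset: int, mask: int):
        self.memory = memory
        self.offset = offset
        self.mask = mask
        self.extra_accesses = 0  # memory accesses the master never issued

    def _locate(self, addr: int):
        # Check-word address and bit slot (n = 32, so shift by log2(32) = 5).
        return self.offset + ((self.mask & addr) >> 5), addr % WORD_BITS

    def write(self, addr: int, data: int) -> None:
        self.memory[addr] = data
        cw_addr, slot = self._locate(addr)
        check_word = self.memory.get(cw_addr, 0)           # extra read
        check_word = (check_word & ~(1 << slot)) | (parity(data) << slot)
        self.memory[cw_addr] = check_word                  # extra write
        self.extra_accesses += 2

    def read(self, addr: int) -> int:
        data = self.memory[addr]
        cw_addr, slot = self._locate(addr)
        stored = (self.memory.get(cw_addr, 0) >> slot) & 1  # extra read
        self.extra_accesses += 1
        if stored != parity(data):
            raise RuntimeError(f"parity error at address {addr:#x}")
        return data


memory = {}
rsm = ToyRSM(memory, offset=0x1000, mask=0x0FFF)
rsm.write(3, 0b1011)
assert rsm.read(3) == 0b1011
memory[3] ^= 1  # inject a single-bit soft error behind the RSM's back
try:
    rsm.read(3)
except RuntimeError as err:
    print("detected:", err)
```

The master only ever calls `write` and `read`; the check-word traffic is counted in `extra_accesses` and stays internal to the model. Parity detects single-bit errors only; a SEC-DED code would need more check bits per data word, which lowers n.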

[Figure: processors (Master 1 … Master n), bus arbiter, RSM, and main memory (slave).]

RSM plugged on the bus arbiter (2)

RSM interface: AHB master → AHB slave

[Figure: AHB master, AHB arbiter, RSM, and AHB slave (memory). Signal labels: WCTRL, ADDR, WDATA; SWCTRL, SADDR, SWDATA; MWCTRL, MADDR, MWDATA. Blocks: MUX1, MUX2. Annotations: data bits, check bits.]

RSM plugged on the bus arbiter (3)

RSM interface: AHB slave → AHB master

[Figure: AHB master, AHB arbiter, RSM, and AHB slave (memory). Signal labels: SRCTRL, SRDATA; MRCTRL, MRDATA; RCTRL, RDATA. Block: MUX3.]

RSM Implementation

2.5 ns & 4154 NAND2 (130 nm HCMOS9)

Clock cycle overhead (MiBench benchmarks)

[Figure: two bar charts of clock cycle overhead for the StringSearch, FFT, and BasicMath benchmarks, each comparing SEC-DED and parity protection. Processor without memory cache: vertical axis from 0% to 30%. Processor with memory cache: vertical axis from 0% to 10%.]

RSM associated with each master on the interconnection sub-system

Larger fault coverage of the interconnection sub-system

Better system performance

[Figure: processors (Master 1 … Master n), each with its own RSM, connected through the interconnection to main memory (slave).]

RSM as an MMU wrapper

Physical address space

– Page-level granularity of the protected zones

Virtual address space

– Number of protection zones equal to the number of integrity levels

Protect MMU-generated memory accesses

[Figure: processor, MMU with RSM, and main memory.]

RSM with solid-state secondary storage sub-system

Transfers protected by RSM

– Blocks with protected data from memory to secondary storage

Unprotected mode transfers

– All blocks from secondary storage to secondary storage

– Blocks with unprotected data and checksums from memory to secondary storage

[Figure: interconnection linking a processor and a DMA (each shown with an RSM), main memory, and secondary storage.]

Conclusions

Low cost
– Flexible: cross-domain & cross-application
– Standard memory, storage & interconnections
– Easy integration into the system (IP core)
  • Small size (same size as a UART)
– No modification of application software
– Low impact on system performance

Yield & Reliability
– Permanent & transient faults

Programmability & Flexibility
– Size, location & integrity levels of protected zones
– Programmable
  • Offset & Mask parameters