dr-andrea-nestl
Collect garbage that has accumulated in the memory and makes your computer slow
Your Benefits at a Glance
Suitable for systems with hard real-time requirements (in particular embedded systems)
Low garbage collection overhead
No interruptions caused by garbage collection
Extremely robust due to object-based memory protection
Fully operational prototype (including a Java compiler)
© TLB GmbH, Karlsruhe 2012
A Novel RISC Processor Architecture For
Garbage Collection in Embedded Systems
Buffer Overflows are Responsible for a
Large Number of Today’s Security and Safety Problems
In a standard computer system, dynamically growing data structures can overwrite unrelated data (buffer overflow)
The standard processor architecture lacks protection mechanisms against buffer overflows
Buffer overflow errors are a common cause of critical security vulnerabilities
Garbage Collection Helps Reduce Buffer Overflows,
But Causes High Overhead & Unpredictable Pauses
Automatic dynamic memory management releases and compacts dynamically allocated memory after its last use
Such “garbage collection” removes common sources of the errors that lead to buffer overflows
Existing garbage collection is mostly software-based, incurs high overhead and causes unpredictable pauses in program execution
The limited resources of embedded systems typically do not allow for efficient garbage collection in real time
A Novel Approach Enables Parallel Garbage Collection
And Parallel Synchronization in Real-Time
The novel RISC processor architecture is optimized for security:
Strict separation of pointers from ordinary non-pointer data by using distinct register sets for pointers and data
The dedicated coprocessor performs the garbage collection:
The coprocessor uses an optimized Baker-style copying collector algorithm that runs in parallel to the main processor
A new garbage collection cycle is started by the coprocessor when the available memory falls below a chosen threshold
Simple hardware extensions to the processor pipeline support the synchronization between garbage collector and main processor
An efficient implementation of this synchronization is key to avoiding unbounded pauses
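The following is a minimal software sketch of the ideas on this slide, not the coprocessor's microcode. It assumes a simplified model: every object keeps pointer fields apart from data fields (mirroring the separate register sets), collector work is done in small steps that stand in for the parallel coprocessor, and a read barrier stands in for the pipeline's synchronization hardware. All names (Obj, Heap, step, read_pointer) are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Obj:
    pointers: List[Optional[int]]  # pointer fields: the only part the collector scans
    data: List[int]                # non-pointer fields: copied, never interpreted
    forward: Optional[int] = None  # to-space address once the object has been copied

    def size(self) -> int:
        return len(self.pointers) + len(self.data)


class Heap:
    def __init__(self, capacity: int, threshold: int):
        self.capacity = capacity          # words per semispace
        self.threshold = threshold        # start a cycle when free words drop below this
        self.space: Dict[int, Obj] = {}   # current (to-)space
        self.old: Dict[int, Obj] = {}     # from-space while a cycle is running
        self.roots: List[Optional[int]] = []  # stands in for the pointer registers
        self.grey: List[int] = []         # copied but not yet scanned
        self.used = 0
        self.next_addr = 0
        self.collecting = False

    def free_words(self) -> int:
        return self.capacity - self.used

    def _place(self, obj: Obj) -> int:
        if self.used + obj.size() > self.capacity:
            raise MemoryError("semispace exhausted")
        addr = self.next_addr
        self.next_addr += 1
        self.space[addr] = obj
        self.used += obj.size()
        return addr

    def _forward(self, addr: Optional[int]) -> Optional[int]:
        # Copy a from-space object on first contact; afterwards return its new address.
        if addr is None or addr not in self.old:
            return addr
        obj = self.old[addr]
        if obj.forward is None:
            obj.forward = self._place(Obj(list(obj.pointers), list(obj.data)))
            self.grey.append(obj.forward)
        return obj.forward

    def start_cycle(self) -> None:
        # Flip the spaces and evacuate the roots; everything else is copied lazily.
        self.old, self.space = self.space, {}
        self.used = 0
        self.collecting = True
        self.roots = [self._forward(r) for r in self.roots]

    def step(self) -> None:
        # One bounded unit of collector work: scan a single copied object.
        if self.grey:
            obj = self.space[self.grey.pop(0)]
            obj.pointers = [self._forward(p) for p in obj.pointers]
        if not self.grey:
            self.old = {}                 # everything left in from-space is garbage
            self.collecting = False

    def alloc(self, n_ptrs: int, data: List[int]) -> int:
        obj = Obj([None] * n_ptrs, list(data))
        if not self.collecting and self.free_words() - obj.size() < self.threshold:
            self.start_cycle()            # threshold reached: begin a new cycle
        return self._place(obj)

    def read_pointer(self, addr: int, index: int) -> Optional[int]:
        # Read barrier: the program only ever sees to-space addresses.
        obj = self.space[addr]
        obj.pointers[index] = self._forward(obj.pointers[index])
        return obj.pointers[index]

    def write_pointer(self, addr: int, index: int, target: Optional[int]) -> None:
        self.space[addr].pointers[index] = target
```

In this model the program must keep every live pointer in roots, because addresses held elsewhere become stale when a cycle starts. Calling step() repeatedly until collecting is False completes a cycle; each call touches one object, which is the property that keeps individual pauses bounded in this sketch.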
This Novel Approach Improves Performance
By Leaving The Cache Largely Unaffected
Software garbage collectors usually displace the entire contents of the cache repeatedly
They examine the entire heap during a single cycle
The coprocessor directly connects to the memory controller
Does not access memory through the main processor’s cache
The cache remains largely unaffected by the garbage collection
The coprocessor ensures cache coherency
Inspects and selectively flushes single cache lines through a dedicated cache port (resembles a snoop port)
The coprocessor eliminates unnecessary memory traffic
Invalidates all cache lines with dead objects at the end of a garbage collection cycle
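To illustrate why invalidating lines with dead objects saves memory traffic, here is a small sketch under stated assumptions: the cache is copy-back, a copying collector leaves only dead objects in the from-space once a cycle ends, and the line size is 32 bytes. The line size and the function name are assumptions and do not come from the slides.

```python
LINE_SIZE = 32  # bytes per cache line (assumed; the slides do not state it)


def invalidate_dead_lines(dirty_lines, from_space):
    """Split dirty line addresses into lines to keep and lines to drop.

    dirty_lines: base addresses of modified cache lines
    from_space:  (start, end) address range that holds only dead objects
    Returns (kept, dropped, bytes_saved). Dropped lines are invalidated
    without being written back to memory.
    """
    start, end = from_space
    dropped = {addr for addr in dirty_lines if start <= addr < end}
    kept = set(dirty_lines) - dropped
    return kept, dropped, len(dropped) * LINE_SIZE
```

For example, with dirty lines at 0x1000, 0x1020 and 0x8000 and a from-space of 0x1000 to 0x2000, two lines are dropped and 64 bytes of write-back traffic are avoided in this model.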
A Fully Functional Prototype Exists
And Has Been Used For Performance Measurements
Main processor & GC coprocessor modeled at register transfer level in VHDL, synthesized for Altera APEX 20K1000C (at 25 MHz)
Pipelined RISC processor, statically scheduled, up to 3 instructions per clock cycle (3-way multiple issue, “in order”)
16 pointer registers, 16 data registers, 8 predicate registers
8K execution cache, 8K data cache, 2K attribute cache
two-way set-associative copy-back cache
Micro-coded garbage collection coprocessor
256 x 80 bit on-chip microcode memory
Uses less than 20% of the chip surface area
Software
Native Java bytecode compiler developed for the architecture. An included code scheduler rearranges instructions to take advantage of the processor’s parallel execution units and to hide instruction latencies
Subset of the Java class libraries supporting text-based apps in order to facilitate the execution of representative programs (includes NFS client)
An Experimental Computer System Was Assembled
Based On The Garbage-Collection Processor
Pauses Caused By Garbage Collection Do
Not Exceed 500 Clock Cycles
[Figure: frequency distribution of synchronization pauses, shown for javac. x-axis: pause duration in clock cycles; y-axis: frequency]
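As a worked conversion, the 500-cycle bound can be expressed in time at the prototype's 25 MHz clock. The sketch below assumes only those two figures from the slides; a different clock frequency would change the result proportionally.

```python
CLOCK_HZ = 25_000_000      # prototype clock frequency from the slides (25 MHz)
MAX_PAUSE_CYCLES = 500     # upper bound on a synchronization pause


def cycles_to_microseconds(cycles: int, clock_hz: int) -> float:
    # Multiply first so the division yields the value in microseconds directly.
    return cycles * 1_000_000 / clock_hz


max_pause_us = cycles_to_microseconds(MAX_PAUSE_CYCLES, CLOCK_HZ)
print(max_pause_us)  # prints 20.0
```

At 25 MHz one clock cycle lasts 40 ns, so 500 cycles correspond to 20 microseconds.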
The Runtime Overhead For The Hardware-Based
Garbage Collection Is Small
The Advantages Of This Approach Could Enable
Real-Time Garbage Collection in Embedded Systems
Limits pauses from garbage collection to 500 clock cycles
Efficient synchronization
No code overhead
Low total runtime overhead of only a few percent
Undisturbed cache locality
Exact (non-conservative) garbage collection
Compiler & code are independent from garbage collector
BACKUP
Efficient Implementation
Coprocessor Architecture
Synchronization I
Synchronization II
Synchronization III
Synchronization IV
The Runtime Overhead Is Small - I
The Runtime Overhead Is Small - II
The Runtime Overhead Is Small - III