Run-Time Reconfiguration for HyperTransport coupled FPGAs using ACCFS
Jochen Strunk, Andreas Heinig, Toni Volkmer, Wolfgang Rehm, Heiko Schick
Chemnitz University of Technology
Computer Architecture Group
Prof. Wolfgang Rehm
WHTRA 2009, Heidelberg / February 12th 2009
Outline
1 Introduction / Goals
2 Run-Time Reconfiguration on FPGAs
3 HyperTransport Cave with Run-Time Reconfiguration Support
4 The Accelerator File System (ACCFS) as Software Framework
5 Case Study - two RTRMs
6 Conclusion
WHTRA 2009, Heidelberg Run-Time Reconfiguration for HyperTransport coupled FPGAs using ACCFS Jochen Strunk 2/23
Introduction
FPGAs are used as accelerators in host coupled systems.
Hot-plug functionality of plug-in cards is not supported by most motherboards, BIOSes and operating systems.
For continuous host link connectivity, today's plug-in cards with FPGAs need a second chip which handles host communication, most commonly a second FPGA.
Uploading further accelerator modules / compute kernels is not possible during run-time, although sufficient space would be available on the FPGA.
Run-time reconfiguration (RTR)
⇒ On the other hand, FPGAs offer run-time reconfiguration support (DPR capable FPGAs, e.g. Xilinx Virtex, -2, -4, -5, -6).
Goals
Provide a solution for host coupled FPGAs
where only one FPGA is needed in a host coupled system, i.e. no additional chip for host communication
which allows the functionality to be changed during run-time, e.g. for uploading further compute kernels
where a software framework exists for user applications, which allows easy handling without restricting the possibilities of run-time reconfigurable FPGAs
Run-Time Reconfiguration on FPGAs
Dynamic partial reconfiguration (DPR) is available on Xilinx Virtex, -2, -4, -5, -6 FPGAs.
The functionality is divided into static and dynamic parts.
Dynamic parts are called Run-Time Reconfigurable Modules (RTRMs).
The granularity of a partially reconfigurable region (PRR) is directly related to configuration frames.
Three different interfaces are available for reconfiguration: JTAG, SelectMAP, ICAP.
A design flow for "Module based Partial Reconfiguration" is applied.
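The frame granularity can be illustrated with a little arithmetic. The following is a minimal sketch, assuming a Virtex-4 style configuration frame of 41 words of 32 bits each (this figure comes from the Xilinx configuration documentation, not from the slides); the function names are hypothetical, and bit stream headers and padding are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption (not from the slides): one configuration frame
 * consists of 41 words of 32 bits each. */
enum { FRAME_WORDS = 41, WORD_BYTES = 4 };

/* Payload size in bytes of a partial bit stream that rewrites
 * `frames` configuration frames (headers and padding ignored). */
size_t prr_payload_bytes(size_t frames)
{
    const size_t frame_bytes = FRAME_WORDS * WORD_BYTES;
    assert(frames <= SIZE_MAX / frame_bytes); /* guard against overflow */
    return frames * frame_bytes;
}

/* Number of whole frames needed to hold `bytes` of configuration data.
 * A PRR can only grow or shrink in steps of one frame, so the result
 * is rounded up. */
size_t frames_for_bytes(size_t bytes)
{
    const size_t frame_bytes = FRAME_WORDS * WORD_BYTES;
    return bytes / frame_bytes + (bytes % frame_bytes != 0);
}
```

Under this assumption a single frame carries 164 bytes, and configuration data one byte larger than that already occupies two frames.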
HT-Cave with RTR-Support
To support RTRMs, the standard HT-Cave-IP-Core must be enhanced.
The overall design must comply with the module based partial design flow.
To ease porting the infrastructure to other interconnects, e.g. PCIe, the functionality is divided into:
host interface specific part: HT Cave, HT Packet Engine
host interface independent part: RTRM, RTRM Controller, Reconfig Unit, Internal Routing Unit
To generate an RTRM bit stream file, a framework is provided to the user.
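HT-Cave with RTR-Support
Infrastructure of an HT-Cave with RTR-Support:
[Figure: block diagram of the FPGA, attached to the host via the HTX connection. Host interface specific part: HT Cave Core, HT Packet Engine. Host interface independent part: Internal Routing Unit, RTRM Controller, Reconfig Unit, RTRM. Four interfaces between the blocks are each labelled N, R, P. The design is divided into a static and a dynamic region.]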
ACCFS-Introduction
ACCFS: Accelerator File System
Open generalized interface for integrating accelerators into Linux based systems
[Figure: ACCFS within the Linux kernel. Applications enter the kernel through the Syscall-API (Process Management, Virtual File System, Virtual Memory, Socket). The Virtual File System hosts logical file systems (ext2, ext3, ..., vfat) and virtual file systems (proc, sysfs, accfs). Below them sit Block Devices, Char Dev (e.g. ac97 driver), Disk Controller Drivers, Bus Drivers, the PCIe Host Bridge and the Hardware. ACCFS connects through a Vendor Interface to Device Handlers (SPU, Nvidia, AMD-Ati, FPGA, ClearSpeed, ...), which address the hardware (SPE, FPGA, ...).]
ACCFS-Hardware Integration Concepts
Virtualization
⇒ Optimize hardware usage
Generic interface
⇒ Establish interface based on well known standard: VFS
Separation of functionalities
⇒ Ease integration of new accelerator types in ACCFS
Host initiated DMA
⇒ Avoid page translation issues on the accelerator system
Asynchronous context execution
⇒ No need for threading when running multiple instances in parallel
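Asynchronous context execution means that one thread can start several contexts and then collect them one after another. The following is a minimal sketch, assuming (as in the pattern matcher example of the case study) that read() on a context's status file blocks until execution has finished. wait_for_contexts is a hypothetical helper, and the content of the status file is not specified in the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: wait for n contexts from a single thread.
 * Each read() is assumed to block until that context has stopped.
 * The status text of context i is stored NUL-terminated in status[i].
 * Returns the number of contexts whose status could be read. */
int wait_for_contexts(const int *status_fds, int n, char status[][12])
{
    int done = 0;

    assert(status_fds != NULL && status != NULL);
    for (int i = 0; i < n; i++) {
        /* leave room for the terminating NUL byte */
        ssize_t len = read(status_fds[i], status[i], 11);
        if (len < 0) {
            status[i][0] = '\0';   /* read failed, report empty status */
            continue;
        }
        status[i][len] = '\0';
        done++;
    }
    return done;
}
```

An application would first start every context and only then call the helper with the status descriptors, so the accelerators work in parallel while the single host thread waits.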
ACCFS-FPGA Usage
[Figure: sequence of interactions between the Application and the Device Handler]
Application: Create Context (sys_acc_create)
Device Handler: Establish Context (FPGA available? First initialization. Returns: context descriptor)
Application: Configure Context (write (ctx/config))
Device Handler: Configure FPGA (Validate bit stream. Program device. Space available?)
Application: Data Exchange (read / write (ctx/*))
Device Handler: Validate Request
Application: Execute Design (sys_acc_run)
Device Handler: State Transition (state goes into running)
Application: Wait for Finish (read (ctx/status))
Device Handler: Wait for 'STOP'
Application: Destroy Context (close (ctx))
Device Handler: Wait for 'STOP'
...
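The "Validate Request" and "State Transition" steps of the device handler can be pictured as a small state machine. The sketch below is purely illustrative: the slides only name the running state and the 'STOP' condition, so the remaining states, the request names and the transition rules are assumptions and not the ACCFS implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical context states; only "running" and 'STOP' appear in the talk. */
typedef enum { CTX_CREATED, CTX_CONFIGURED, CTX_RUNNING, CTX_STOPPED } ctx_state;

/* Hypothetical requests arriving at the device handler. */
typedef enum { REQ_CONFIGURE, REQ_RUN, REQ_STOP } ctx_request;

/* Validate a request against the current state and, if it is allowed,
 * perform the state transition. Returns false and leaves *state
 * unchanged when the request is rejected. */
bool ctx_handle_request(ctx_state *state, ctx_request req)
{
    assert(state != NULL);
    switch (req) {
    case REQ_CONFIGURE:
        /* assumed rule: no reconfiguration while the design is running */
        if (*state == CTX_RUNNING)
            return false;
        *state = CTX_CONFIGURED;
        return true;
    case REQ_RUN:
        /* assumed rule: a design must be configured before it can run */
        if (*state != CTX_CONFIGURED && *state != CTX_STOPPED)
            return false;
        *state = CTX_RUNNING;
        return true;
    case REQ_STOP:
        if (*state != CTX_RUNNING)
            return false;
        *state = CTX_STOPPED;
        return true;
    }
    return false;
}
```

In this model a run request on a freshly created context is rejected, which mirrors the order of the application steps: configure first, then execute.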
Case Study
As proof of concept we implemented 2 different compute kernels as RTRMs:
a pattern matcher, which finds patterns in a byte stream
a Mersenne Twister, which is a pseudo random number generator
A vendor device driver supporting the UoH HTX Virtex-4 XC4VFX60 FPGA Card was implemented.
A user application was implemented, which uploads and exchanges the RTRMs during run-time with the use of ACCFS.
Case Study
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>
// acc_create, acc_run, configure_fpga, c_kernel_function, V_ID, D_ID,
// MEM_SIZE and the *_BITSTREAM / *_OFFSET constants are not shown on this slide.

// Pattern Matcher offload function
int matcher_run(void *search_db_in, int db_size,
                void *patterns_in, int pattern_count,
                void *results_out, int results_size) {
    int ret;
    char bufstatus[12];
    // create context of our static FPGA design
    int fd_ctx = (int)acc_create("example", V_ID, D_ID, 0750, NULL);
    // configure the design
    int fd_cfg = openat(fd_ctx, "config", O_WRONLY);
    configure_fpga(fd_cfg, MATCHER_RTRM_BITSTREAM);
    // open memory and status
    int fd_mem = openat(fd_ctx, "memory/FPGA MEM1", O_RDWR);
    int fd_status = openat(fd_ctx, "status", O_RDONLY);
    // fill memory with data (DMA bulk transfer)
    pwrite(fd_mem, search_db_in, db_size, DB_OFFSET);
    pwrite(fd_mem, patterns_in, 4 * pattern_count, PATTERN_OFFSET);
    // start the matcher
    acc_run(fd_ctx, 0);
    // check status
    // (wait until context execution finished)
    read(fd_status, bufstatus, 12);
    // read results of operation (DMA bulk transfer)
    ret = pread(fd_mem, results_out, results_size, RESULTS_OFFSET);
    // close files
    close(fd_status);
    close(fd_mem);
    close(fd_cfg);
    return ret;
}

// Mersenne Twister offload function
int run_compute_kernel(double *results_out, int results_count) {
    // create context of our FPGA design
    int fd_ctx = (int)acc_create("example", V_ID, D_ID, 0750, NULL);
    // configure the design
    int fd_cfg = openat(fd_ctx, "config", O_WRONLY);
    configure_fpga(fd_cfg, MERSENNE_RTRM_BITSTREAM);
    // open memory
    int fd_mem = openat(fd_ctx, "memory/FPGA MEM1", O_RDWR);
    // allocating buffer
    int32_t *buffer = (int32_t *) mmap(NULL, MEM_SIZE, PROT_READ | PROT_WRITE,
                                       MAP_SHARED, fd_mem, 0);
    int32_t *mt32_numbers = buffer + NUMBERS_OFFSET;
    // start the Mersenne twister MT32
    acc_run(fd_ctx, 0);
    // Example C function that uses random numbers
    c_kernel_function(results_out, results_count, mt32_numbers);
    // unmap buffer
    munmap((void *) buffer, MEM_SIZE);
    // close files
    close(fd_mem);
    close(fd_cfg);
    return 0;
}
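The helper configure_fpga is called in both offload functions but is not shown in the talk. A minimal sketch of what such a helper might look like, assuming it simply streams the RTRM bit stream file into the context's config file; whether ACCFS accepts the bit stream in several write() calls or expects a single one is not stated in the slides, and the chunk size is an arbitrary choice.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: copy the bit stream file at bitstream_path to the
 * already opened config descriptor fd_cfg.
 * Returns 0 on success and -1 on any open, read or write error. */
int configure_fpga(int fd_cfg, const char *bitstream_path)
{
    char buf[4096];
    ssize_t n;
    int fd_bit;

    assert(bitstream_path != NULL);
    fd_bit = open(bitstream_path, O_RDONLY);
    if (fd_bit < 0)
        return -1;

    while ((n = read(fd_bit, buf, sizeof buf)) > 0) {
        ssize_t off = 0;
        /* write() may accept fewer bytes than requested, so loop */
        while (off < n) {
            ssize_t w = write(fd_cfg, buf + off, (size_t)(n - off));
            if (w < 0) {
                close(fd_bit);
                return -1;
            }
            off += w;
        }
    }
    close(fd_bit);
    return n < 0 ? -1 : 0;
}
```

Because the helper only relies on plain file descriptors, it can be exercised without an FPGA by pointing fd_cfg at a pipe or a regular file.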
Case Study
[Figure: placed and routed HT-Cave with RTR-support and pattern-matcher as RTRM]
Resource utilization of XC4VFX60:
4 clock regions for HT Cave with RTR-support
12 clock regions for RTRM
Conclusion
By using the run-time reconfiguration capability of FPGAs, it is possible to build single FPGA chip solutions for host coupled accelerators.
A design of an RTR-capable infrastructure was shown which allows RTR modules to be managed during run-time.
The implementation was done for an FPGA directly coupled to the HyperTransport processor bus of the host system.
The software framework ACCFS provides a generic interface to user applications which is able to satisfy the demands of RTR computing.
The concept provided is applicable to other processor and peripheral bus coupled FPGAs.
End
The End.
Thank you for your attention!
Questions?
ACCFS-Project Status
ACCFS 0.5 alpha available
http://www.tu-chemnitz.de/cs/ra/projects/accfs
Features
Host support for x86 and x86_64 (ppc32/64 available soon!)
Support for recent Linux kernels
Fully operational VFS interface
Device handler support for UoM HTX Virtex-4 FPGA Card
TODO
Resource discovery interface (via proc or sysfs)
Extend vendor interface for better virtualization support
Other device handlers?: Cell/B.E., Clearspeed, ...
ACCFS Development Road Map 2009
[Figure: road map chart with a time axis of Q4 2008, Q1 2009, Q2 2009, Q3 2009, Q4 2009, Q1 2010 and the following items]
IBM QS21 PCIe coupled
Cell/B.E. SPE integration
Virtualization facilitating functions
ClearSpeed / GPGPUs
Case study ?
Support for Virtex-5 PCIe board
Extended support for UoH HTX FPGA card