FPGA based instrumentation for Correlators, Spectrometers, and VLBI
(how to build eight radio astronomy instruments in two years)
Dan Werthimer, University of California, Berkeley
http://seti.berkeley.edu
Our research group is really 3 groups
• SETI (plus primordial black holes, HI mapping)
• Public Participation Scientific Computing
• CASPER – Center for Astronomy Signal Processing and Electronics Research
UC Berkeley SETI Programs
Name        Time Scale      Search Type
SERENDIP    seconds         radio sky survey
SETI@home   ms - seconds    radio sky survey
Astropulse  ns - ms         radio sky survey
SEVENDIP    ns              visible targeted
SPOCK       1000 seconds    visible targeted
DYSON                       IR targeted
The SETI@home Client
SETI@home Statistics
                TOTAL                               RATE
Participants    5,464,550 (in 226 countries)        2,000 per day
Computer time   2.3 million years                   1,200 years per day
Computation     4×10^21 floating point operations   200 TeraFLOPS
Public Participation Supercomputing Group
David Anderson, Rom Walton, SETI Group
• aka Distributed Computing
• aka “edge resource aggregation”
BOINC: NSF
• Berkeley Open Infrastructure for Network Computing
– General-purpose distributed computing framework.
– Open source.
– Will make distributed computing accessible to those who need it. (Starting from scratch is hard!)
Projects
• Astronomy
– SETI@home (Berkeley)
– Astropulse (Berkeley)
– Einstein@home: gravitational pulsar search (Caltech,…)
– PlanetQuest (SETI Institute)
– Stardust@home (Berkeley, Univ. Washington,…)
• Earth science
– Climateprediction.net (Oxford)
• Biology/Medicine
– Folding@home, Predictor@home (Stanford, Scripps)
– FightAIDS@home: virtual drug discovery
• Physics
– LHC@home (CERN)
• Other
– Web indexing/search
– Internet Resource mapping (UC Berkeley)
Where's the computing power?
● 2010: 1 billion Internet-connected PCs
● 55% privately owned
● If 100M participate:
– 100 PetaFLOPs, 1 Exabyte (10^18) storage
[Figure labels: your computers, academic, business, home PCs]
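The 100 PetaFLOPs and 1 Exabyte totals are a straightforward multiplication. The sketch below reproduces them under assumed per-PC figures of 1 GFLOPS and 10 GB of spare disk; the slide does not state those values, they are simply what the totals imply when divided by 100M participants.

```python
# Per-PC figures are assumptions inferred from the slide's totals, not stated on it.
PARTICIPANTS = 100_000_000       # "If 100M participate"
FLOPS_PER_PC = 1e9               # assumed: 1 GFLOPS per PC
STORAGE_PER_PC_BYTES = 10e9      # assumed: 10 GB of spare disk per PC


def aggregate(participants, per_pc):
    """Total resource if every participant contributes per_pc."""
    return participants * per_pc


total_flops = aggregate(PARTICIPANTS, FLOPS_PER_PC)
total_bytes = aggregate(PARTICIPANTS, STORAGE_PER_PC_BYTES)
print(f"{total_flops / 1e15:.0f} PetaFLOPS, {total_bytes / 1e18:.0f} Exabyte")
```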
CASPER:
Center for Radio Astronomy Signal Processing and Electronics Research
Henry Chen, Daniel Chapman, Pat Crescini, Pierre Droz, Kirsten Meder,
Vinayak Nagpal, Arash Parsa, Aaron Parsons, Andrew Siemion, Dan Werthimer
Radio Astronomy Lab: Don Backer, Paul Demorest, Matt Dexter, Carl Heiles, David McMahon, Mel Wright, Lynn Urry
Berkeley Wireless Research Center: Bob Brodersen, Chen Chang, John Wawrzynek
SETI Institute: Dave DeBoer, Gerry Harp
Collaborators: Jeff Mock, NAIC, NRAO, ATNF, JPL/DSN, Harvard/Smithsonian/CFA, MIT/Haystack, GMRT, Caltech, South Africa KAT
CASPER Real-time Signal Processing Instrumentation
(NSF ATI, MRI)
• Low NRE, shared by the community
• Rapid development (8 instruments / 2 years)
• Open-source, collaborative
• Reusable, platform-independent gateware
• Modular, upgradeable hardware
• Industry standard communication protocols
• Low Cost
MOTIVATION
ATA, SKA, Focal Plane Arrays, SETI,
need >> PetaOp/sec
Instruments take a long time to build, very high NRE
The Radio Revolution
Allen Telescope Array
• 6.1-meter offset Gregorian (2.4-meter secondary)
ATA-42 Operational This Fall
The Problem with the Current Hardware Development Model
• Takes 5 years
• Cost dominated by NRE because of custom boards, backplanes, protocols
• Antiquated by the time it’s released.
Solution:
• Modular Hardware
– Low number of board designs
– Can be upgraded piecemeal or all together
– Reusable
– Standard signal processing model which is consistent between upgrades.
Solution: use FPGAs
1 FPGA = 100 Pentium, 1/500 the power per op
Computational Density Comparison
[Figure: (MOPS/MHz)*lambda^2 on a log scale (1,000 to 10,000,000) versus release date (1995 to 2004); series: Processor Peak, FPGA 32-bit int MAC]
FPGA maximum sustained performance
[Figure: MOPS (32-bit MAC) on a log scale (1 to 100,000) versus release date (1996 to 2002)]
3X improvement per year!
Moore's Law for FPGAs
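To put the "3X improvement per year" claim in perspective, the short calculation below compounds it over five years and compares it with a doubling every 18 months. The 18-month doubling is an assumed reference rate chosen for illustration; the slide does not state which form of Moore's Law it has in mind.

```python
def growth(factor_per_year, years):
    """Compound improvement after the given number of years."""
    return factor_per_year ** years


YEARS = 5
fpga_gain = growth(3.0, YEARS)                  # slide: 3X per year
reference_gain = growth(2 ** (1 / 1.5), YEARS)  # assumed: doubling every 18 months

print(f"after {YEARS} years: FPGA x{fpga_gain:.0f}, reference x{reference_gain:.1f}")
```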
Compute Module Diagram
[Figure: compute module block diagram; recoverable labels listed below]
• 5 FPGAs (2VP70FF1704), each shown with FPGA fabric, MGT, and memory controller
• 4GB DDR2 DRAM, 12.8GB/s (400DDR); 20 DRAM blocks drawn
• 138 bits, 300MHz DDR, 41.4Gb/s
• IB4X/CX4 links: one labeled 20Gbps, four labeled 40Gbps
• 100BT Ethernet
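The bandwidth labels on the diagram can be checked with simple arithmetic. The sketch below assumes "300MHz DDR" means 300 million transfers per second on each of the 138 bits, which reproduces the 41.4Gb/s label. For the DRAM figure it assumes 64-bit DDR2-400 channels and four channels; neither is stated on the slide, but that combination matches 12.8GB/s.

```python
def rate_gbps(width_bits, transfers_per_sec):
    """Raw throughput in gigabits per second for a parallel link."""
    return width_bits * transfers_per_sec / 1e9


# 138-bit link at an assumed 300 million transfers per second.
link_gbps = rate_gbps(138, 300e6)

# Assumed: 64-bit DDR2-400 channel (400 million transfers/s), 4 channels.
DRAM_CHANNELS = 4
dram_gbytes_per_sec = rate_gbps(64, 400e6) / 8 * DRAM_CHANNELS

print(f"link: {link_gbps:.1f} Gb/s, DRAM: {dram_gbytes_per_sec:.1f} GB/s")
```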
Platform-Independent, Parameterized Gateware
• What is Gateware?
– Design logic of FPGAs (between hardware and software)
• Need libraries for signal processing which don’t have to be rewritten every hardware generation.
• Matlab Simulink!
Biplex Pipelined FFT
• Uses 1/6 the resources of the Xilinx module.
FFT controls
Simulink Library – Aaron Parsons
Verilog Library – Jeff Mock
• Transform length
• Bandwidth
• Complex or Real
• Number of Polarizations
• Input bit width and output bit width
• Twiddle coefficient bit width
• Run-time programmable down-shifting
• Decimate option
Filter Response: PFB vs. FFT
PFB vs. FFT
Additional PFB controls
(Aaron Parsons, Jeff Mock)
• Filter overlap
• Width of filter coefficients
• Window function for filter (Hamming, Hanning, etc.)
• Import filter coefficients for custom filter performance
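The PFB-versus-FFT comparison can be illustrated with a small software model. This is a minimal sketch of a generic polyphase filter bank (weight, fold, transform), not the CASPER library implementation. The function names, the 16-channel and 4-tap sizes, and the Hamming-windowed sinc prototype are illustrative assumptions. It uses a naive DFT so that it runs with the standard library alone.

```python
import cmath
import math


def prototype_filter(n_chan, n_taps):
    """Hamming-windowed sinc, n_chan * n_taps coefficients, one channel wide."""
    length = n_chan * n_taps
    coeffs = []
    for i in range(length):
        x = (i - (length - 1) / 2) / n_chan  # time in units of one FFT block
        sinc = 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)
        hamming = 0.54 - 0.46 * math.cos(2 * math.pi * i / (length - 1))
        coeffs.append(sinc * hamming)
    return coeffs


def dft(block):
    """Naive O(N^2) DFT, adequate for a small demonstration."""
    n = len(block)
    return [sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(block))
            for k in range(n)]


def pfb(samples, n_chan, n_taps):
    """Weight n_chan * n_taps samples, fold them into n_chan points, then DFT."""
    weighted = [s * c for s, c in zip(samples, prototype_filter(n_chan, n_taps))]
    folded = [sum(weighted[p * n_chan + t] for p in range(n_taps))
              for t in range(n_chan)]
    return dft(folded)


def leakage(spectrum, signal_bin, far_bin):
    """Power in far_bin relative to power in signal_bin."""
    return abs(spectrum[far_bin]) ** 2 / abs(spectrum[signal_bin]) ** 2


N_CHAN, N_TAPS = 16, 4
# Complex tone 0.3 channels above the centre of channel 4, so it is off-bin.
tone = [cmath.exp(2j * math.pi * 4.3 * n / N_CHAN) for n in range(N_CHAN * N_TAPS)]
fft_leak = leakage(dft(tone[:N_CHAN]), 4, 8)
pfb_leak = leakage(pfb(tone, N_CHAN, N_TAPS), 4, 8)
print(f"leakage into channel 8: FFT {fft_leak:.2e}, PFB {pfb_leak:.2e}")
```

The window function and coefficient set are the knobs the slide lists; swapping `prototype_filter` for imported coefficients changes the channel shape without touching the rest of the pipeline.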
Digital Down-Converter
• Selectable # of FIR taps
• On-the-fly programmable mix frequency
• Selectable FIR coefficients
• Agile sub-band selection.
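A digital down-converter combines a complex mixer, a low-pass FIR filter, and decimation. The sketch below is a generic software model of that chain, not the gateware block itself. The function names, the 31-tap filter, the 0.05 cycles/sample cutoff, and the decimation factor of 16 are illustrative assumptions.

```python
import cmath
import math


def fir_lowpass(n_taps, cutoff):
    """Hamming-windowed sinc low-pass; cutoff in cycles/sample, unity gain at DC."""
    mid = (n_taps - 1) / 2
    taps = []
    for i in range(n_taps):
        x = i - mid
        ideal = (2 * cutoff if x == 0
                 else math.sin(2 * math.pi * cutoff * x) / (math.pi * x))
        taps.append(ideal * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n_taps - 1))))
    total = sum(taps)
    return [t / total for t in taps]


def ddc(samples, mix_freq, taps, decimation):
    """Mix down by mix_freq (cycles/sample), FIR filter, keep every decimation-th output."""
    mixed = [s * cmath.exp(-2j * math.pi * mix_freq * n) for n, s in enumerate(samples)]
    last_start = len(mixed) - len(taps)
    # The taps are symmetric, so this sliding dot product equals a convolution.
    return [sum(t * mixed[start + k] for k, t in enumerate(taps))
            for start in range(0, last_start + 1, decimation)]


# Real tone at 0.25 cycles/sample, mixed to DC; the filter removes the image term.
signal = [math.cos(2 * math.pi * 0.25 * n) for n in range(256)]
baseband = ddc(signal, mix_freq=0.25, taps=fir_lowpass(31, 0.05), decimation=16)
print(len(baseband), abs(baseband[0]))
```

Changing `mix_freq` between calls corresponds to the programmable mix frequency, and passing a different `taps` list corresponds to the selectable tap count and coefficients.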
X-Engine Correlation Architecture (Lynn Urry, Aaron Parsons)
X-Engine Architecture: applied to an arbitrarily sized antenna array
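The slides do not describe the internals of the X-engine, so the sketch below shows only the generic operation an X-engine performs: for one frequency channel, cross-multiply every antenna pair and accumulate over time. The function name and data layout are illustrative assumptions, not the Urry and Parsons architecture.

```python
from itertools import combinations_with_replacement


def x_engine(frames):
    """Accumulate x_i * conj(x_j) over time for every antenna pair i <= j.

    frames[t][a] is the complex sample from antenna a at time t,
    for a single frequency channel.
    """
    n_ant = len(frames[0])
    pairs = list(combinations_with_replacement(range(n_ant), 2))
    acc = {pair: 0j for pair in pairs}
    for frame in frames:
        for i, j in pairs:
            acc[(i, j)] += frame[i] * frame[j].conjugate()
    return acc


# Four antennas, two identical time samples.
frame = [1 + 1j, 2 + 0j, 0 + 1j, -1 + 0j]
visibilities = x_engine([frame, frame])
print(len(visibilities), visibilities[(0, 1)])
```

For N antennas the result holds N(N+1)/2 products including autocorrelations, which is why the computation grows quadratically with array size.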
Hardware and Software Libraries
[Figure: library diagram; surviving labels: legend, Applications]
Global Interconnects
• Commercial 10GbE switch from HP, Fujitsu, Foundry, Extreme Networks, Force10
– Packet switched, non-blocking
– <= 224 ports (4X) per chassis
– Up to 10,000 ports in a system
– 200~1000 ns switch latency
– 400~1200 ns FPGA to FPGA latency
– ~2.88 Tbps full duplex constant cross section bandwidth
– $600 per port
[Figure: Compute Node #1 through Compute Node #N; Infiniband Crossbar Switch; Ethernet Switch]
[Figure: block diagram. Labels: ADC, Polyphase Filter Banks (PFB), commercial off-the-shelf multicast 10 Gbps (10GE or InfiniBand) switch, Reconfigurable Compute Cluster (FPGA DSP modules, general-purpose CPUs), Correlator, Beamformers/Spectrometers, Pulsar timer]
Beowulf-cluster-like general-purpose architecture
Dynamic allocation of resources; need not be FPGA based
Targeted Applications
• Moderate to high-bandwidth problems
– For low bandwidths, just use CPUs
• Lower to mid-scale computation
– For very large applications (SKA), may be more cost effective to design ASICs
• Rapid Development
Applications
• VLBI Mark 5B data recorder - Haystack – 500 MHz
• Beamforming – SMA – Vinayak Nagpal, Jonathan Weintroub
• SETI – Arecibo (UCB)
– JPL/UCB DSN (Preston, Gulkis, Levin, Jones)
• Correlators and Imagers:
– ATA (Mel Wright)
– Reionization Experiment (Backer, Bradley…)
– CARMA Next Gen (Dave Hawkins, Caltech)
– SKA demonstrator South Africa (Justin Jonas)
VLBI Digitizer-Channelizer for Mark5
Haystack: Shep Doeleman, Brian Fanous, Alan Rogers, Alan Whitney
UCB: Henry Chen, Aaron Parsons, Pierre Droz
• Interfaces to MARK 5 data recorder
• 500 MHz bandwidth * 2 IFs (Only 1 IF now)
• 16 or 32 channels per IF
• Polyphase Filter Bank
VLBI Mark 5B Front End
500 MHz BW, 32 channel filter bank
1 GHz bandwidth “Pocket Spectrometer”
• Using ATMEL ADCs at 2 Gsamples/sec
• Performing 4 real FFTs in 1 (complex) biplex pipelined FFT module.
• 2048 channels
• Uses just 1 ADC, 1 IBOB, and your laptop.
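Packing real inputs into a complex transform relies on a standard identity: two real sequences can be transformed with one complex DFT and then separated using conjugate symmetry. The sketch below shows only that two-in-one trick with a naive DFT. How the biplex core arranges four real inputs is not described on the slide, and the function names here are illustrative.

```python
import cmath
import math


def dft(block):
    """Naive O(N^2) DFT, adequate for a small demonstration."""
    n = len(block)
    return [sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(block))
            for k in range(n)]


def two_real_ffts(a, b):
    """Transform real sequences a and b with one complex DFT of a + i*b."""
    n = len(a)
    z = dft([complex(x, y) for x, y in zip(a, b)])
    spec_a, spec_b = [], []
    for k in range(n):
        mirror = z[(n - k) % n].conjugate()
        spec_a.append((z[k] + mirror) / 2)    # conjugate-symmetric part gives A[k]
        spec_b.append((z[k] - mirror) / 2j)   # conjugate-antisymmetric part gives B[k]
    return spec_a, spec_b


a = [1.0, 2.0, 0.0, -1.0, 3.0, 0.5, -2.0, 1.5]
b = [0.0, 1.0, -1.0, 2.0, 0.5, -0.5, 1.0, 3.0]
spec_a, spec_b = two_real_ffts(a, b)
print(spec_a[1], spec_b[1])
```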
128 Million Channel SETI Spectrometer
• 200 MHz Bandwidth, 2 Hz resolution
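The channel count follows from dividing bandwidth by resolution. The calculation below assumes a power-of-two transform length, which is typical for FFT-based spectrometers but is not stated on the slide. Under that assumption "128 Million" corresponds to 2^27 channels, and the exact resolution comes out a little finer than the rounded 2 Hz.

```python
import math

BANDWIDTH_HZ = 200e6
TARGET_RESOLUTION_HZ = 2.0

channels_needed = BANDWIDTH_HZ / TARGET_RESOLUTION_HZ  # 100 million
# Assumption: round up to the next power of two for the transform length.
n_channels = 1 << math.ceil(math.log2(channels_needed))
resolution_hz = BANDWIDTH_HZ / n_channels

print(f"{n_channels} channels ({n_channels // 2**20} Mi), {resolution_hz:.2f} Hz each")
```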
Multi-Purpose Spectrometer – Low Bandwidth (Aaron Parsons)
[Figure: board block diagram. Labels: four 200 MHz ADCs (I and Q for Pol. 1 and Pol. 2), Xilinx Virtex-II 6000 FPGA, Xilinx Virtex-II 1000 FPGA, 256 MB DRAM, 200 Aux. I/O, Compact PCI Backplane, Software]
SERENDIP V Spectrometer
SETI Applications
• JPL/UCB/SI DSN Sky Survey (20 GHz Bandwidth)
• Parkes Southern SERENDIP
• ALFA Sky Survey (300 MHz x 7 beams)
• SETI Italia (Bologna)
• SETI@home
Astronomy Applications
• GALFA Spectrometer – Arecibo Multibeam Hydrogen Survey
• Astronomy Signal Processor – ASP – Don Backer, Ingrid Stairs, et al. (pulsars)
• ATA4 Correlator F Engine
• Reionization Experiments (Don Backer, Rich Bradley, Chippendale, Ekers)
• Antenna Holography, ATNF, China
• GMRT correlator
SERENDIP V
[Figure: block diagram. Labels: 128 MHz inputs for Pol. 1 and Pol. 2, Polyphase Filter Bank, four servers w/ EDT card, four GbE switches, groups of PCs]
Astronomy Signal Processor: Don Backer, Jeff Mock, Paul Demorest
GALFA Spectrometer
[Figure: signal chain. Labels: IF Pol. 1 and IF Pol. 2 (100 MHz); Quadrature Downconverter Board (sin/cos mixers, LPFs, -50 to +50 MHz); Multipurpose Spectrometer Board (Biplex 256 pt. PFB, e^-it mixers, FIR LPF, 12.5 MHz digital, decimate by 16, Biplex 8192 pt. PFB, Stokes); cPCI backplane to CPU]
Mars Orbiter mm Spectrometer
ASIC based spectrometer (Mars)
• 2W/ADC + 2W/ASIC = 4 Watts
• Use UCB’s “Chip in a Day” software (compiles FPGA code into ASIC)
• Use rad hard libraries from LBL
1960 – First Radio Astronomy Digital Correlator
Sandy Weinreb
• 21 lags, 300 kHz clock
• discrete transistors
• $19,000
Correlator processing power
[Figure: correlator processing power in GFlops (log scale, 1 to 10^7) versus year (1970 to 2015). Labeled correlators: DLB, DXB, DCB, DAS, VLA, EVN/WSRT, SMA, LOFAR, EVLA, ALMA, SKA]
source: Arnold van Ardenne
Future SETI Spectrometers
Year   Bandwidth     Beams
2015   4 THz         400 beams (10 GHz each)
2020   128 THz       12,800 beams
2025   4000 THz      40,000 beams
2030   128,000 THz   1M beams
Caveats
• Risky
• Simulink new, buggy, not open source (Verilog, VHDL old)
• Just a bunch of clever students
• We’ve built the easy instruments so far (not the hard ones); yet to demonstrate packetized correlator and compute cluster
CASPER the Friendly...
• Group Helping Open-source Signal-processing Technology (GHOST?)
– Goal to help develop signal processing instrumentation and libraries for the community.
– Open source hardware, gateware, and software.
– Provide training and tutorials
– Not so much delivering turn-key instruments.
Selected correlator quotes
Ray Escoffier: “With correlator performance having gone up by a factor of 922,000 over the last 30 years, it's only fair that correlator design engineers' salaries should have gone up by a similar factor!!”
Sandy Weinreb: “In 1960 there were no chips; just discrete transistors! The $19,000 was the cost of the samplers, shift registers, and counter. It did not include the cost of the 21 accumulators which I made myself in a few months getting paid $240/month.”
Sergei Pogrebenko: “It is desirable that the output data rate from a data processor is less than the input data rate.”
http://seti.berkeley.edu