Department of Particle Physics & Astrophysics
Introduction to the RPT & its concepts. Roadmap to today’s lectures and lessons.
CERN follow-on to the Oxford RPT Workshop
Michael Huffer, [email protected], SLAC National Accelerator Laboratory
January 30, 2014
v7
Representing (here today): Mark Arndt, Ric Claus, Sergio Maldonado, Jim Panetta, Gregg Thayer
Representing (back home): Gunther Haller, Ryan Herbst, Steve Tether, Tony Waite, Matt Weaver
Disclaimers
• We are here to introduce a set of hardware, firmware & software tools
  – You are here to learn how to use them
• However, what is being introduced is a work in progress
  – When (if) you use these tools you represent the very, very early adopters
• Consequently…
  – Not all its components are yet in place (we will point out those which aren’t)
  – Workshop is targeted to software rather than firmware developers
  – Maturity of some components is not all that could be hoped for
  – Formal documentation and user support is in an embryonic state
  – What we leave with you in lectures & lessons is (almost) all the existing documentation
• You are also here to give us input on:
  – The workshop (both its organization & content)
  – Our documentation & user support plan (Mark will speak of this later)
• The tools we introduce are intended to maximize flexibility (sometimes at the expense of clarity)
  – With that flexibility comes complexity as well as the need to make choices
  – One workshop cannot address and explain all its complexity
• What is introduced prioritizes performance at the expense of friendliness
  – You will be working in an embedded (cross-development) & real-time environment
  – You will require an additional set of skills (and patience!)
• We are TLA (Three-Letter-Acronym) rich (see glossary)
• But, have FUN & don’t be afraid to experiment…
  – Expect to crash processors & expect to corrupt file systems
The second TLA (What is RPT?)
• RPT is one component of the DOE program on “Detector R & D”
• RPT is a TLA for RCE Platform Technology
• What is an RCE?
  – (Reconfigurable)-Cluster-Element (a tool to be explained later)
• What is a Platform Technology? To quote Wikipedia:
  – “technology that enables the creation of products and processes that support present or future or past development. It establishes the long-term capabilities of research & development institutes. It can be defined as a structural or technological form from which various products can emerge without the expense of a new process/technology introduction.
  – In computing platforms, for example, computer hardware serves as platform for an operating system which in turn is a platform for Enterprise Infrastructure Software which in turn is a platform for application software. Transport infrastructure similarly serves as platform for vehicles”.
• Tools to leverage the construction of large-scale, high performance DAQ systems
• We will loosely use RCE as both a tool & a technology
  – Context should make meaning clear, forgive us but feel free to complain…
Next: The program’s historical background…
Historical genesis of the RPT…
• Intended to provide DAQ/trigger “technology” for next generation HEP experiments…
  – One component of SLAC’s ongoing contribution to DOE’s Detector R & D project
• We were in a unique position to not sell an existing DAQ system, but to instead…
  – Study past, current & future systems
    • B-Experiments (Belle and BaBar)
    • Current Energy frontier (CMS & ATLAS)
    • Future Energy Frontier (LHC in the upgrade era (ATLAS))
    • Cosmology, dark energy and matter frontier (LSST)
    • Intensity & Neutrino frontier (LBNE)
  – Treat systems as “use cases” & attempt to “distill” their underlying commonality
• Survey and leverage relevant recent & future industry innovation
  – System-On-Chip (SOC)
  – High speed, low cost, low power serialization and deserialization
  – Packaging technologies (xTCA and VPX (VITA 46))
• Program was divided into four distinct time phases:
  – Phase-0, “Survey requirements and capture their commonality”
  – Phase-1, Technology evaluation, provide demonstration Platform-Technology (PT)
  – Phase-2, Lessons learned, provide sustainable architecture with demonstration PT
    • Demonstration PT abandoned (overtaken by events)
  – Phase-3, Meet performance goals and provide final PT
We are at the end of Phase-3
Next: enumerate some observations obtained from Phase-0…
Observations
• Detector data is naturally received by a DAQ system on multiple, parallel “channels”
• The properties (physical, electrical & logical) of a channel cannot be dictated by the DAQ system
• Inhomogeneous channels, detector rates & volume, the need to minimize dead-time & ensure deterministic latencies will inevitably force a DAQ system to be partitioned into two domains:
  – A front-side system which demands real-time support & is (historically) purpose-built
  – A back-side system which emphasizes commodity processing and networking
• SLAC’s experience & expertise meant it was well positioned to address the question:
  – Is there commonality in a front-side system and if so, how can it be extracted?
Next: the (abstract) properties of a front-end
Properties of a Front-End system
• Detector data is naturally received by Front-Side on multiple, parallel “channels”
• Data acquired on a channel must be processed:
  – Tagged, filtered, feature extracted and (potentially) reformatted.
• Processing should occur in parallel to take advantage of detector’s natural parallelism
  – Processing should be done by multiple, independent computational elements
  – Computational elements should scale with detector elements
  – There is no one “right” model for a computational element (FPGA, DSPs or CPU)
• Parallel processing requires an interconnect for local communication between elements
  – Interconnects should scale with computational elements
  – To leverage industry innovation, interconnect should exploit commodity networking
• Adequate scaling requires close attention to minimizing power, cooling, footprint & cost
  – This is true for both computational elements and interconnects
• Transition to (& isolation from) back-end requires an intraconnect
  – Allows communication between interconnects as well as back-end
  – Intraconnect should employ the same networking standard used by interconnect
• The system must allow provision for external, synchronous timing and trigger
• Uptime and maintenance requirements require a commodity “packaging” standard
Next: A model of the abstract front-end system…
The Abstract Front-End System (4 layers)
Four functional layers physically constrained by a packaging standard
[Figure: diagram of the four layers, with the following callouts]
• Adaptation layer allows implementation of subsequent layers to be invariant against detector channel properties
• Computational layer satisfies requirements from previous slide
• Timing also uses adaptation mechanism
RPT provides the packaging as well as the building blocks to implement these four layers
The four building blocks of the RPT
• A Packaging Standard:
  – Is ATCA (Advanced Telecommunications Computing Architecture) from PICMG
  – Crate based with a serial backplane
  – Backplane is protocol agnostic; we impose 10-Gigabit Ethernet (XAUI)
  – Provides (almost completely) the Adaptation mechanism
  – Provides (almost completely) the implementation of the Intraconnect
  – The RPT contributes an ATCA compliant front-board (the COB)
• A Computational Element:
  – Is the RCE (Reconfigurable-Cluster-Element)
  – The RCE is based on SOC (System-On-Chip) technology
  – Uses any member of the Xilinx Zynq family
  – For current RCEs we employ both the Z-7045 & Z-7030
• A CI (Cluster Interconnect):
  – Is based on 10-Gigabit Ethernet switching
  – Provides the implementation of an interconnect
  – Connects one (1) to seventeen (17) RCEs together to form a Cluster
  – Provides interface to intraconnect
• Support software:
  – Both RCE and host based
Next: Drill into each building block…
ASIS 14-slot shelf with 40Gb, full-mesh, 13-way backplane
[Figure: front and rear views of the shelf, labeled: power supplies, air exit, shelf manager, backplane, fans, one slot on backplane, air entrance]
• Backplane defines the topology of the Intraconnect
• Each slot contains two boards: the Front-board & RTM (Rear-Transition-Module)
  – Boards are physically independent, but logically & electrically deeply coupled
Front-Boards & RTMs
[Figure: Front Board (COB) and RTM, labeled: a cluster of nine (9) RCEs + CI, external interface to intraconnect, internal interface to intraconnect, interface to adaptation layer]
• The COB (Cluster-On-Board) is an ATCA compliant Front-Board
• The RTM is the adaptation layer
  – Implementation provided by user
The RCE (Reconfigurable-Cluster-Element)
• The SOC is a Xilinx Zynq, it’s…
  – An FPGA
    • Equivalent in resources to large Kintex
    • 900 DSP tiles (> 200 TeraMACS/Sec)
    • 16 MGTs (12.5 Gbits/s SerDes)
  – A Processor
    • ARM (dual-core) A-9 @ 900 MHz…
    • FPGA resources, including fabric, are configured & loaded through the processor
    • Contains very fat & very rich I/O pipes between processor & FPGA fabric
• Physically it’s a SOC + interfaces + memory…
  – Consumes between 15 & 20 watts
  – Footprint is ~45 mm²
  – Costs around $500-800
  – Memory is DDR3 (ephemeral) & SD (persistent)
Next: But logical view expresses function…
The RCE (another view)
• Logically it’s a CE (Cluster Element) + Plugins
  – And as well, an interface between them…
• Plugins are:
  – Application specific logic (firmware)
  – The XAUI is one example of a plugin
• The CE (Cluster-Element) is…
  – The nexus for all communication
  – Contains the processor, has external interfaces & manages memory
  – Manages the Ethernet (XAUI) plugin
• Plug/Socket interface
  – Allows for the independent and concurrent exchange of information between CE & plugins
Next: Drill into the CE and plugins …
Carrier boards for RCEs (the DTM & DPM)
[Figure: DPM (2 RCEs) and DTM (1 RCE) carrier boards, labeled: SOC (7045) ×2, SOC (7030), DDR3 ×3, Micro-SD ×3]
The CE (Cluster Element)
• ARM (dual-core) A-9 @ 900 MHz…
  – 2 Gbytes of DDR3
  – Micro-SD (removable)
  – 10-GE MAC
• Bootstrap configuration
  – External I2C interface
  – Communicates with IPMC
  – Provides localization information
• Socket side of Plug/Socket Interface
  – Interface to application specific logic
  – Uses all 4 high speed AXIs + ACP
  – Transfers data in units of frames
  – Receives/transmits concurrently
  – Provides up to 4 Gigabytes/s of bandwidth
Next: Examine frames & plugins …
Frames are the unit of data exchange between plugin & CE
• Frame length…
  – Is measured in units of bytes (8-bits)
  – Varies from one byte to a plugin specified maximum
• Inbound frames
  – Originate in the plugin and are delivered to the CE
• Outbound frames
  – Originate in the CE and are delivered to the plugin
• Frames contain two (2) variable sized components…
  – The header (begins at the first byte of the frame and ends where the payload begins)
  – The payload (begins following the last byte of the header and ends at the last byte of the frame)
• Either header or payload may have a length of zero (0)
• Frame processing software expects that lengths are discoverable either statically or within the header.
• Header and payload are contiguous in the plugin and discontiguous in memory
• Headers are placed in the processor’s cache coherency domain
  – Their contents are automatically cached on reception & automatically flushed on transmission
• Indirection of header and payload allows:
  – Software placement of payload, naturally allowing “zero-copies” of payload
  – Overlapping of header and payload processing
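The header/payload split described above can be sketched in a few lines of Python. This is an illustration only: the 4-byte header layout (a 2-byte tag followed by a 2-byte payload length) and the names `HEADER`, `build_frame` and `split_frame` are assumptions made for the example, not the RPT frame format or the SAS interface. The sketch shows a payload length that is discoverable within the header, and a payload that is referenced without being copied.

```python
import struct

# Hypothetical header layout for illustration only (not an RPT format):
# 2-byte tag, then 2-byte payload length, both big-endian.
HEADER = struct.Struct(">HH")


def build_frame(tag: int, payload: bytes) -> bytes:
    """Build a contiguous frame: header immediately followed by payload."""
    return HEADER.pack(tag, len(payload)) + payload


def split_frame(frame: bytes):
    """Split a contiguous frame into (tag, payload_view).

    The payload is returned as a memoryview slice of the original
    buffer, so no payload bytes are copied; only the header is decoded.
    """
    if len(frame) < HEADER.size:
        raise ValueError("frame shorter than header")
    tag, payload_len = HEADER.unpack_from(frame, 0)
    payload = memoryview(frame)[HEADER.size:HEADER.size + payload_len]
    if len(payload) != payload_len:
        raise ValueError("payload length in header exceeds frame")
    return tag, payload


frame = build_frame(7, b"detector-data")
tag, payload = split_frame(frame)
```

Because `payload` is a view onto `frame`, header processing can proceed while the payload stays where it was placed, which is the idea behind the "zero-copy" bullet.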
Frames are exchanged using a Plug & Socket model
• The user:
  – Creates, designs and implements application specific logic
  – Wraps this logic with an RPT provided core called the “Plug”
  (These two steps are a firmware process, not to be discussed further in this workshop)
• The combination of logic and plug is the “Plugin”.
• Plugins are connected to Sockets on the CE
• Connection is triggered by loading user bit-file (partitioning and dynamic configuration)
• Sockets are exposed to the processor through a set of I/O registers (the port)
• RPT provides a software wrapper around the port:
  – The SAS (Socket Abstraction Services)
• These services are used by application to implement a specific software plugin
Next: Data into the RCE has been covered, how about out?…
The Interconnect and the cluster it serves
• A cluster contains…
  – From 1-17 RCEs, each with 10G Ethernet
  – Interface to intraconnect (up to 23 ports)
  – Cluster-Interconnect (CI)
  – One RCE always designated to manage cluster
• CI (Cluster Interconnect) is a 24 port, 10-GE, Layer-2 switch
  – CI is a 1536-pin ASIC (Intel FM2224)
  – Consumes ~30 watts
  – Cut through (200 ns ingress/egress)
  – Fully provisioned, for example…
    • Pause frames, link aggregation, VLAN etc…
The COB’s Cluster (DTM & DPM)
• The DTM (Data-Transport-Module) RCE
  – Just one, manages the cluster
  – Operates under Linux
  – Its customization is not a topic today
• Two interfaces to intraconnect (10 GE)
  – Off the board through front-panel
    • Two (2) channels of SFP+
  – Between boards using ATCA P2 fabric
    • Thirteen (13) channels of XAUI
• The DPM (Data-Processing-Module) RCE
  – There can be up to eight (2 today)
  – Connected to adaptation layer
  – Processes detector data
  – Its customization is a topic for today
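The port counts quoted for the CI and for the COB can be cross-checked with simple arithmetic. The sketch below assumes that each RCE and each intraconnect channel occupies exactly one port of the 24-port CI; that one-port-per-link assumption and the function name `intraconnect_ports` are ours, not taken from the slides.

```python
CI_PORTS = 24  # the CI is described as a 24-port 10-GE switch


def intraconnect_ports(rces: int, ci_ports: int = CI_PORTS) -> int:
    """Ports left for the intraconnect once every RCE has one port.

    Assumes one switch port per RCE (an assumption of this sketch).
    """
    if not 1 <= rces <= ci_ports:
        raise ValueError("a cluster needs between 1 and ci_ports RCEs")
    return ci_ports - rces


# COB as described: 1 DTM + up to 8 DPM RCEs, 2 SFP+ and 13 XAUI channels.
cob_rces = 1 + 8
cob_intraconnect = 2 + 13
```

Under that assumption a single-RCE cluster leaves 23 ports, matching the "up to 23 ports" figure, and the COB's nine RCEs leave 15 ports, matching its two SFP+ plus thirteen XAUI channels.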
Next: The internal intraconnect (the ATCA backplane)
4 clusters (internal intraconnect strategies)
[Figure: two diagrams, “4 clusters, Star intraconnect” and “4 clusters, Mesh intraconnect”]
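As a rough way to compare the two strategies, the number of point-to-point links can be counted. This is a sketch under stated assumptions: the star is modeled with one of the clusters acting as the hub, which the slide does not specify, and the function names are hypothetical.

```python
def mesh_links(n: int) -> int:
    """Point-to-point links in a full mesh of n clusters."""
    return n * (n - 1) // 2


def star_links(n: int) -> int:
    """Links in a star where one of the n clusters acts as the hub.

    Treating a cluster as the hub is an assumption of this sketch.
    """
    return max(n - 1, 0)


def channels_per_slot(n: int) -> int:
    """Backplane channels each slot needs to reach every other slot."""
    return max(n - 1, 0)
```

For four clusters this gives 6 mesh links against 3 star links. For a 14-slot shelf a full mesh needs 13 channels per slot, which is consistent with the 13-way backplane and the COB's thirteen XAUI channels.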
Next: Software
Software Development model
• Autonomous execution environment
  – An Operating System (O/S) executing application code on an RCE
  – Automatic startup on processor reset
  – No human interaction
  – Execution images and configuration data located on local file-system (SD)
• Development environment
  – Is host based (Red Hat 5/6 & Scientific Linux)
  – In short you practice “cross development”
    • Code is compiled and linked on a host for distribution to a target
• Development and execution environments are bridged through TCP/IP
  – Code must be moved from host to local file-system
• What is being compiled and linked?
  – Sharable images
  – Connection between an image and its location on the target is keyed to the sharable name
The SDK and its distribution
• All software is distributed in the form of an SDK (Software Development Kit)
  – Cross-development tools
  – Header files + appropriate core sharables
  – Examples (but not all you see today)
  – Host based discovery, IPMI & target tools
• SDKs are downloaded to a host via Git
  – Hosts are Red Hat 5/6 and/or Scientific Linux
• Targets are updated via an SDK
  – Update does not affect application specific code on the target
• There are three different types of SDKs (one for each type of target)
  – RCE/RTEMS
  – RCE/LINUX
  – HOST/LINUX
• We are not in the business of providing a build system…
  – SDK tools are designed to be build system “neutral”
  – We expect you to incorporate our tools into your build system
Operating Systems
• None! (bare metal)
  – But not recommended…
• RTEMS (we’ve added value to the kernel)
  – Open Source, Real-Time kernel
  – POSIX compliant interfaces
  – Added value:
    • File system support (SD drive and NFS)
    • Dynamic linker/loader
    • Services (system logging & Telnet for example)
    • Configuration management (Symbol/Value tables)
• Arch Linux
  – Default O/S on DTM
  – Not the focus of this workshop, which will instead focus on RTEMS
• Both O/S include a fully provisioned TCP/IP stack
  – DTM IP address assignment is static, but rule based
  – DPM IP address assignment assumes external DHCP server
Today’s (workshop) Roadmap
• You have a majority of the RPT developers in-house!
  – Opportunity for break-outs tomorrow on specific topics…
• Lectures in the morning:
  – ATCA tutorial & COB overview
  – Software tools (installation and usage)
  – Shelf installation & operation (networking)
  – Embedded, RCE based software development
  – Roadmap for user support
• “Hands-on” lessons in the afternoon (all lessons build on their predecessor)
  – Dancing with the hardware
  – Dancing with the software (“hello world”)
  – Discovery services (how to address the RCE logically)
    • Pseudo ARP (Address Resolution Protocol)
  – Developing plugin software using the 10-GE plugin
    • Pseudo UDP (Uses ARP)
    • Command/Response to pseudo detector (detector space interaction)
Presenter labels from the slide, in order: Gregg, Jim, Gregg again, Sergio, Mark, Gregg, Jim, Sergio, Jim
Roles & responsibilities matrix
| Role | Responsibilities | Needed expertise | See… |
| Shelf administrator | Define and configure shelf. Define & install networking. Maintain shelf hardware. | ATCA, networking & system administration | Gregg’s lecture & lesson |
| Software administrator | Install RTEMS. Install & maintain SDKs. | System administration | Jim’s lecture & lesson |
| Plugin user | Develop application against existing plugins | Real-time programming & networking | all |
| Plugin architect | Define requirements. Determine division between firmware & software. Supervise plugin developers. | Firmware and software design | Not addressed by today’s introductory workshop (topic of dedicated Plugin workshop) |
| Plugin developer (firmware) | Implement firmware side of plugin | Digital engineering | Not addressed by today’s introductory workshop (topic of dedicated Plugin workshop) |
| Plugin developer (software) | Implement software side of plugin | Real-time programming | Sergio’s lecture & lesson |
| RTM developer | Develop application specific adaptation layer | Digital engineering (schematic, layout & fabrication) | Gregg’s lecture & lesson |
Callout on slide (pointing at one role): This is what we expect to be the most common role
Questions?
• Thanks to our hosts!
  – Andreas, Jaya John & Sue