CWG4 – The data model
• The group proposes a time-frame-based data model to:
  – Formalize the access to data types produced by both detector FEE and data processing stages by prepending a generic Multiple Data Header
  – Provide strict memory management while minimizing the need for copying data for processing purposes (data service instead of “copy around”)
    • Direct access to the data in memory without any additional processing
  – Use efficient data layouts allowing for fast navigation among data types and sources and usage of data from vectorized algorithms
• Ongoing investigation and prototyping of efficient AOD formats
  – Flat vs. hierarchical object structures and the impact on processing speed and data compression
  – Investigation on I/O and compression and the output of synchronous reconstruction to be discussed with CWG7 (reconstruction)
• Future work: integration simulation and benchmark
  – Realistic raw time frame simulation (CWG8) + time frame aggregation (CWG4) + FLP to EPN flow (CWG3) + concurrency model and platforms (CWG5) down to EPN reconstruction -> To be done in CWG13
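The "flat vs. hierarchical" question for AOD formats can be illustrated with a minimal sketch. The `Track` class and its `pt` and `eta` fields are hypothetical and chosen only for illustration; the slides do not specify the AOD content. The sketch shows the same data once as a list of objects and once as flat, contiguous columns that can be exposed as a buffer without copying.

```python
from array import array
from dataclasses import dataclass


@dataclass
class Track:
    """Hypothetical AOD object; the field names are illustrative assumptions."""
    pt: float
    eta: float


# Hierarchical layout: one Python object per track.
hierarchical = [Track(1.0, 0.1), Track(2.5, -0.3), Track(0.5, 1.2)]

# Flat layout: one contiguous column of doubles per field.
flat = {
    "pt": array("d", (t.pt for t in hierarchical)),
    "eta": array("d", (t.eta for t in hierarchical)),
}

# The same quantity computed from both layouts.
sum_pt_objects = sum(t.pt for t in hierarchical)
sum_pt_flat = sum(flat["pt"])

# A flat column can be handed to other code as a buffer view, without a copy.
pt_view = memoryview(flat["pt"])
```

Which layout processes or compresses better is exactly what the investigation above is meant to measure; the sketch makes no claim either way.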
CWG4 - Data model
Multiple Data Header
• FLP would add a common header type (MDH) for all data blocks, both raw and produced @ FLP
  – Common part
    • Unique HW ID (FLP/EPN) + version ID
    • Summary info for what follows (partly extracted from the single data header (SDH))
      – Data type, number of blocks, block length, status
    • Used for navigation in the time frame & unique identification
  – Specific part
    • Relevant SDH info for fast navigation (error bits, fired trigger, see CDH now)
  – Transient block address table (for DDL data coming in sync)
• Make data blocks look the same
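The common part of the MDH listed above could be sketched as a fixed-size binary record. The field widths, their order, and the little-endian encoding are assumptions made for this example; the slides only name the fields, not their layout.

```python
import struct
from dataclasses import dataclass

# Assumed layout (not from the slides), little-endian:
# hw_id u32, version u16, data_type u16, n_blocks u32, block_length u32, status u32
MDH_COMMON = struct.Struct("<IHHIII")


@dataclass(frozen=True)
class MdhCommon:
    """Sketch of the MDH common part; names follow the slide, sizes are assumed."""
    hw_id: int
    version: int
    data_type: int
    n_blocks: int
    block_length: int
    status: int

    def pack(self) -> bytes:
        """Serialize the header into its fixed-size binary form."""
        return MDH_COMMON.pack(self.hw_id, self.version, self.data_type,
                               self.n_blocks, self.block_length, self.status)

    @classmethod
    def unpack_from(cls, buffer, offset=0):
        """Read a header in place from any bytes-like buffer."""
        return cls(*MDH_COMMON.unpack_from(buffer, offset))


# Usage: prepend the header to a payload and read it back without slicing.
header = MdhCommon(hw_id=12, version=1, data_type=2,
                   n_blocks=10, block_length=256, status=0)
frame = header.pack() + b"payload"
decoded = MdhCommon.unpack_from(frame)
```

Because every block starts with the same fixed-size record, a reader can step through a time frame using only `n_blocks` and `block_length`, which is the navigation role the slide assigns to the common part.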
The new generic data block: extension of the current schema
• All data blocks produced by FEE cards or by arbitrary processing tasks on the FLP (e.g. cluster finding) are to be described as generic MDB blocks. An MDH is foreseen to point to several correlated “events” coming asynchronously on different links on the same FLP. Events will have a sub-frame structure (like today)
• Processing of MDB blocks is transparent to the node type (FLP, EPN)
• EPNs will process MDB blocks but are not required to produce MDB blocks in turn; they produce the persistent event format instead.
Data block types
• Type=Heartbeat
  – HW ID = CTP
  – HB global counter
  – HB local counters
  – Orbit/BX
  – Nb. of blocks
  – Requested actions: start run, pause, resume, end
• Type=FEE block
  – HW ID = equipment
  – Orbit/BX
  – Size
  – Nb. of blocks
  – Status bits
  – SDH(CDH)+PAYLOAD
• Type=Clusters
  – SW ID = clusterizer version
  – Size
  – Nb. of blocks
  – Status bits
  – SDH+PAYLOAD
• Type=Trigger
  – HW ID = CTP
  – Orbit/BX
  – Size
  – Nb. of blocks
  – Status bits
  – SDH+PAYLOAD
Heartbeat ~ time stamp + commands (i.e. start, pause, continue, stop)
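As a rough illustration, the four block types could be modelled as follows. The class names, the enum values, and the decision to make Orbit/BX optional (the Clusters block does not list it) are assumptions of this sketch, not part of the proposal.

```python
from dataclasses import dataclass
from enum import Enum, IntEnum
from typing import Optional, Tuple


class BlockType(IntEnum):
    """Block types named on the slide; the numeric values are arbitrary."""
    HEARTBEAT = 0
    FEE = 1
    CLUSTERS = 2
    TRIGGER = 3


class Action(Enum):
    """Run-control actions a heartbeat may request."""
    START_RUN = "start run"
    PAUSE = "pause"
    RESUME = "resume"
    END = "end"


@dataclass
class Heartbeat:
    """Heartbeat ~ time stamp (orbit/BX, counters) + commands."""
    hw_id: str
    global_counter: int
    local_counters: Tuple[int, ...]
    orbit: int
    bx: int
    n_blocks: int
    actions: Tuple[Action, ...] = ()


@dataclass
class DataBlock:
    """FEE, Clusters and Trigger blocks share this shape in the sketch."""
    block_type: BlockType
    source_id: str          # HW ID (equipment, CTP) or SW ID (clusterizer version)
    size: int
    n_blocks: int
    status_bits: int
    payload: bytes          # SDH (or CDH) followed by the payload
    orbit: Optional[int] = None
    bx: Optional[int] = None


hb = Heartbeat("CTP", global_counter=42, local_counters=(42, 41),
               orbit=1000, bx=0, n_blocks=10, actions=(Action.PAUSE,))
clusters = DataBlock(BlockType.CLUSTERS, "clusterizer-v1", size=3,
                     n_blocks=1, status_bits=0, payload=b"abc")
```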
Data management - FLP
[Figure: data streams of Link i and Link i+1 on an FLP, each delimited by the heartbeats HBn and HBn+1, with the data blocks BLi(t,t+dt) and BLi+1(t,t+dt) stored in the link buffers &buffer(link1) and &buffer(link2).]
• MDH: Type HB, ID #12345, Nblocks 10; Link #1 addr1, Link #2 addr2, …
• MDH: Type RAW, ID #12346, Nblocks 10; Link #1 addr3, Link #2 addr4, …
• Addresses: offset in buffer
• Local processing
• Serialize to EPN
• Minimize searches on EPN for synchronized blocks
• For continuous readout it can be just the same for data reads correlated in time
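The block address table can be sketched as a list of (link, offset) entries that are resolved into views on the link buffers, so that local processing reads the data in place. The `BlockRef` and `Mdh` names and the explicit `length` field are assumptions of this example; the slide shows only one address per link.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BlockRef:
    """One table entry: which link buffer, and where the block sits in it."""
    link: int
    offset: int
    length: int  # assumed field; the slide shows only an address per link


@dataclass
class Mdh:
    """Minimal MDH carrying a transient block address table."""
    block_type: str
    block_id: int
    table: List[BlockRef]


def resolve(mdh, buffers):
    """Return zero-copy views of the blocks referenced by the MDH."""
    views = []
    for ref in mdh.table:
        view = memoryview(buffers[ref.link])
        views.append(view[ref.offset:ref.offset + ref.length])
    return views


# Two link buffers holding data that arrived in sync.
buffers = {1: bytearray(b"AAAABBBB"), 2: bytearray(b"CCCCDDDD")}
mdh = Mdh("RAW", 12346, [BlockRef(1, 4, 4), BlockRef(2, 0, 4)])
views = resolve(mdh, buffers)

# Writing to the buffer is visible through the view: nothing was copied.
buffers[1][4] = ord("X")
```

Since the table is transient, a serialization step towards the EPN would have to replace the buffer offsets with positions inside the serialized frame.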
The time frame data
• The time frames will start and end with an O2 “heartbeat” MDH (events) and embed all data blocks collected by a given FLP. The corresponding frames will have to be aggregated on an EPN node in a folder-like structure that is easy for reconstruction algorithms to browse. The fast (synchronous) persistent reconstruction format will have to achieve the required overall compression.
• Note that the HBE (“heartbeat event”) summary may be attached to the “end HBE” to allow for asynchronous dispatching of blocks before the frame is fully aggregated by the FLP.
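One possible reading of the "folder-like structure" on the EPN is a two-level mapping from data type to FLP to the list of blocks. The sketch below assumes this layout and a hypothetical `aggregate` helper; the actual aggregation scheme is not specified in the slides.

```python
from collections import defaultdict


def aggregate(frame_id, sub_frames):
    """Group blocks of one time frame as data type -> FLP -> [payloads].

    sub_frames is an iterable of (flp_id, blocks) pairs, where blocks is a
    list of (data_type, payload) tuples sent by that FLP for this frame.
    """
    tree = defaultdict(lambda: defaultdict(list))
    for flp_id, blocks in sub_frames:
        for data_type, payload in blocks:
            tree[data_type][flp_id].append(payload)
    # Convert to plain dicts so lookups of missing keys raise KeyError.
    return {
        "frame_id": frame_id,
        "data": {dtype: dict(per_flp) for dtype, per_flp in tree.items()},
    }


frame = aggregate(7, [
    ("FLP1", [("RAW", b"a"), ("CLUSTERS", b"b")]),
    ("FLP2", [("RAW", b"c")]),
])

# Reconstruction code can then browse by type first, then by source.
raw_from_flp2 = frame["data"]["RAW"]["FLP2"]
```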