Upload
others
View
4
Download
0
Embed Size (px)
Citation preview
0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
0.4
0.45
Perf
orm
an
ce p
er
ST
E(*
10
00
0)
Baseline AP/SpAP
0
1
2
3
4
5
6
Sp
eed
up
0.1% 1%
Architectural Support for Efficient Large-Scale Automata ProcessingHongyuan Liu* Mohamed Ibrahim* Onur Kayiran† Sreepathi Pai‡ Adwait Jog*
*College of William & Mary †AMD ‡University of Rochester
1- Automata Processing
2- Challenges & Opportunities
5- Summary
4- Efficient Automata Processing on AP
3- Potential Benefits & Research Questions
▪ Observation: Repeated configurations and executions on AP whichcauses inefficiency
▪ Goal: Accelerate large-scale NFA processing on AP+ Demonstrate that a large number of NFA states are Cold during
execution but still configured to AP+ Predict if a state is Cold or Hot @ compile time using a small
profiling input+ Propose topological-order based NFA partitioning into Predicted
Cold and Predicted Hot states+ Develop SparseAP to handle mispredictions efficiently using our
proposed Enable and Jump operations
▪ Results+ 2.1x Speedup (up to 47x)
Potential SolutionRemove Cold states from the NFAs
Configure ONLY the Hot states to AP
Used widely in different areas
Von Neumann architectures are not efficient at FSM processing- Irregular memory accesses- Limited Parallelism
Solution: Use Automata Processor (AP)
- Enables in-memory processing
- Exploits state parallelism of NFAs
Applications are getting Bigger
AP capacity is Limited
Challenge: Repeated Executions!
Time
Opportunity: Underutilization of APPattern mismatch Many unused states are configured to AP
0%20%40%60%80%
100%
CA
V4
k
DS
CA
V
DS0
3
DS0
6
Sno
rt_L
DS0
9
Sno
rt
HM
15
00
HM
50
0
HM
10
00
HM
PEN TC
P
Rg1 EM ER
Rg0
5
Ferm
i
Pro
Bri
ll
LV
Bro
21
7
SPM
RF1
RF2P
erc
en
tage
of
Stat
es Hot (Enabled) Cold (Never-enabled)
Time
Oracular knowledge of input Arbitrary states partitioning
Question#1:How to predict Cold states?
Question#2:How to partition NFAs?
Mispredictions
Question#3:How to handle mispredictions efficiently?
Decrease Batches
We acknowledge the support of the National Science Foundation (NSF) grants (#1657336, #1717532, #1750667)
MICRO 2018
Q1: How to predict Cold states?
47x
1.8x2.1x
32.1%
Q2: How to partition NFAs?
Q3: How to handle mispredictions efficiently?
Tra
inin
gTe
sting
Solution: Use a small profiling input to predict
the Hot/Cold states
% from Input
% from Training
Accuracy
50% 100% 97%
10% 20% 93%
1% 2% 90%
0.1% 0.2% 87%
Oracular knowledge of input
Time
Input
Predicted
Hot Set
Predicted
Cold Set
u
v
v’
Generated Intermediate
Report(s)
Problem: Input stream execution on the predicted Cold set is too Expensive
v; Cycle x
Solution: SparseAP Execution ModeSpAP = Jump Op + Enable Op
Generated Intermediate Report(s)
Arbitrary states partitioning
Solution: Partition using Topological Order
Correlates with Cold and Hot states
Makes transition unidirectional 0%20%40%60%80%
100%
CA
V4
k
DS
CA
V
DS0
3
DS0
6
Sno
rt_L
DS0
9
Sno
rt
HM
15
00
HM
50
0
HM
10
00
HM
PEN TC
P
Rg1 EM ER
Rg0
5
Ferm
i
Pro
Bri
ll
LV
Bro
21
7
SPM
RF1
RF2P
erc
en
tage
of
Stat
es
Shallow Medium Deep