8/13/2019 Chalmers 2011
Eliminating the Hardware/Software Divide
Satnam Singh, Microsoft Research Cambridge, UK
IRQ, NMI
locks
monitors
condition variables
spin locks
priority inversion
multiple independent multi-ported memories
fine-grain parallelism and pipelining
hard and soft embedded processors
LUTs are just higher order functions
[Figure: lut1 (input i, output o), lut2 (inputs i0, i1), lut3 (inputs i0 to i2) and lut4 (inputs i0 to i3), each with a single output o]
inv = lut1 not
and2 = lut2 (&&)
mux = lut3 (\s d0 d1 -> if s then d1 else d0)
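The same idea can be sketched in C#, the language used for most of the later slides. This is an illustrative model only, not the circuit library behind the slide: the names `Lut1`, `Lut2` and `Lut3` are assumptions chosen to mirror the definitions above. Each one takes an ordinary function, tabulates it once, and returns a function that answers by table lookup, which is what a hardware LUT does.

```csharp
using System;

static class Lut
{
    // 1-input LUT: evaluate f for both inputs once, then look the answer up.
    public static Func<bool, bool> Lut1(Func<bool, bool> f)
    {
        bool[] table = { f(false), f(true) };
        return i => table[i ? 1 : 0];
    }

    // 2-input LUT: a 4-entry truth table indexed by (i1, i0).
    public static Func<bool, bool, bool> Lut2(Func<bool, bool, bool> f)
    {
        var table = new bool[4];
        for (int n = 0; n < 4; n++)
            table[n] = f((n & 1) != 0, (n & 2) != 0);
        return (i0, i1) => table[(i0 ? 1 : 0) | (i1 ? 2 : 0)];
    }

    // 3-input LUT: an 8-entry truth table indexed by (i2, i1, i0).
    public static Func<bool, bool, bool, bool> Lut3(Func<bool, bool, bool, bool> f)
    {
        var table = new bool[8];
        for (int n = 0; n < 8; n++)
            table[n] = f((n & 1) != 0, (n & 2) != 0, (n & 4) != 0);
        return (i0, i1, i2) => table[(i0 ? 1 : 0) | (i1 ? 2 : 0) | (i2 ? 4 : 0)];
    }

    // The three gates from the slide, built by passing functions to the LUTs.
    public static readonly Func<bool, bool> Inv = Lut1(b => !b);
    public static readonly Func<bool, bool, bool> And2 = Lut2((a, b) => a && b);
    public static readonly Func<bool, bool, bool, bool> Mux = Lut3((s, d0, d1) => s ? d1 : d0);

    static void Main()
    {
        Console.WriteLine(Inv(true));              // False
        Console.WriteLine(And2(true, false));      // False
        Console.WriteLine(Mux(true, false, true)); // True (s selects d1)
    }
}
```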
XC6VLX760 758,784 logic cells, 864 DSP blocks,
1,440 dual ported 18Kb RAMs
32-bit integer adder (32/474,240)
>700MHz
332x1440
14820 sim-adds
1,037,400,000,000 additions/second
XD2000i: FPGA in-socket accelerator for Intel FSB
XD2000F: FPGA in-socket accelerator for AMD socket F
XD1000: FPGA co-processor module for socket 940
Case Study: Spam Filtering (Alessandro Forin, MSR Redmond)
Benchmark
~50,000 regular expressions from Forefront Team (snapshot from their Exchange server in Aug 09)
Performance
Up to 6000x faster than standard Intel processors
Capable of processing at line rate of gigabit Ethernet
Power Requirement
7-10 watts rather than 200++ watts
Software Version: ~1 Message/Sec
FPGA Version: ~6000 Messages/Sec
René Müller (ETH)
FPGAs + SQL [VLDB]
CPU FPGA
Speed Grade       -1        -2        -3
With Layout       270MHz    320MHz    362MHz
Without Layout    210MHz    260MHz    280MHz
541 seconds
1896 seconds
opportunity
scientific computing
data mining
search
image processing
financial analytics
challenge
The Accidental Semi-colon
// size is the number of filter taps; the slide assumes it is defined elsewhere in the class.
public static int[] SequentialFIRFunction(int[] weights, int[] input)
{
    int[] window = new int[size];
    int[] result = new int[input.Length];
    // Clear the window of x values to all zero.
    for (int w = 0; w < size; w++)
        window[w] = 0;
    // For each sample...
    for (int i = 0; i < input.Length; i++)
    {
        // Shift in the new x value
        for (int j = size - 1; j > 0; j--)
            window[j] = window[j - 1];
        window[0] = input[i];
        // Compute the result value
        int sum = 0;
        for (int z = 0; z < size; z++)
            sum += weights[z] * window[z];
        result[i] = sum;
    }
    return result;
}
PLDI 1998
PLDI 2003
PLDI 2010
POPL 1998
POPL 2002
POPL 2010
ray of light
Signal
Liquid Metal
PRET-C
Bluespec
Feldspar
Accelerator
RapidMind / Ct
Streams-C
Esterel
SHIM
universal language?
embedded
high level software
FPGA
GPU
DSP
machine learning
grand unification theory
polyglots
Gannet
DSLs
Our High Level Synthesis Projects
Kiwi: concurrent C# programs for control-oriented applications [David Greaves, Univ. Cambridge]
shape analysis: synthesis of dynamic data structures (C) [MPI and CMU]
Accelerator/FPGA: synthesis of data parallel programs in C++/C#/F# [MSR Redmond]
HLINQ
eDSLs
[Gavin Bierman]
+ compilation of self-recursive Haskell functions to FPGA circuits!
Redmond Accelerator Team
Barry Bond
Kerry Hammil
Lubomir Litchev
Effort vs. Reward
low effort, low reward
high effort, high reward
medium effort, medium reward
CUDA
OpenCL
HLSL
DirectCompute
Accelerator
Accelerator
Application.EXE Accelerator.DLL
Windows on Intel/AMD processor
DX9 GPU (ATI, Nvidia, ...)
open System
open Microsoft.ParallelArrays

let main(args) =
    let x = new FloatParallelArray (Array.map float32 [|1; 2; 3; 4; 5|])
    let y = new FloatParallelArray (Array.map float32 [|6; 7; 8; 9; 10|])
    let z = x + y
    use dx9Target = new DX9Target()
    let zv = dx9Target.ToArray1D(z)
    printf "%A\n" zv
    0
open System
open Microsoft.ParallelArrays

let main(args) =
    let x = new FloatParallelArray (Array.map float32 [|1; 2; 3; 4; 5|])
    let y = new FloatParallelArray (Array.map float32 [|6; 7; 8; 9; 10|])
    let z = x + y
    use sse3Target = new X64MulticoreTarget()
    let zv = sse3Target.ToArray1D(z)
    printf "%A\n" zv
    0
open System
open Microsoft.ParallelArrays

let main(args) =
    let x = new FloatParallelArray (Array.map float32 [|1; 2; 3; 4; 5|])
    let y = new FloatParallelArray (Array.map float32 [|6; 7; 8; 9; 10|])
    let z = x + y
    use fpgaTarget = new FPGAMulticoreTarget()
    fpgaTarget.ToArray1D(z)
    0
[Figure: the F# arrays [1; 2; 3; 4; 5] and [6; 7; 8; 9; 10] in the CPU address space are encapsulated as the data-parallel arrays x and y (FloatParallelArray). The expression x+y becomes GPU code operating on GPU memory in the GPU address space, and the result [7; 9; 11; 13; 15] comes back as an F# array.]
using System;
using Microsoft.ParallelArrays;

namespace AddArraysPointwise
{
    class AddArraysPointwiseDX9
    {
        static void Main(string[] args)
        {
            var x = new FloatParallelArray(new[] {1.0F, 2, 3, 4, 5});
            var y = new FloatParallelArray(new[] {6.0F, 7, 8, 9, 10});
            var dx9Target = new DX9Target();
            var z = x + y;
            foreach (var i in dx9Target.ToArray1D(z))
                Console.Write(i + " ");
            Console.WriteLine();
        }
    }
}
module Main where

import Accelerator

x = fpa [1.0, 2.0, 3.0, 4.0, 5.0]
y = fpa [6.0, 7.0, 8.0, 9.0, 10.0]

z = x + y

main = do dx9Target
[Figure: expression graph for the convolution of pa: Shift (0,0) multiplied by k[0], plus Shift (0,1) multiplied by k[1], and so on]
let rec convolve (shifts : int -> int []) (kernel : float32 []) i (a : FloatParallelArray) =
    let e = kernel.[i] * ParallelArrays.Shift(a, shifts i)
    if i = 0 then
        e
    else
        e + convolve shifts kernel (i-1) a
// size is the number of filter taps; the slide assumes it is defined elsewhere in the class.
public static int[] SequentialFIRFunction(int[] weights, int[] input)
{
    int[] window = new int[size];
    int[] result = new int[input.Length];
    // Clear the window of x values to all zero.
    for (int w = 0; w < size; w++)
        window[w] = 0;
    // For each sample...
    for (int i = 0; i < input.Length; i++)
    {
        // Shift in the new x value
        for (int j = size - 1; j > 0; j--)
            window[j] = window[j - 1];
        window[0] = input[i];
        // Compute the result value
        int sum = 0;
        for (int z = 0; z < size; z++)
            sum += weights[z] * window[z];
        result[i] = sum;
    }
    return result;
}
shift (x, 0) = [7, 2, 5, 9, 3, 8, 6, 4] = x
shift (x, -1) = [7, 7, 2, 5, 9, 3, 8, 6]
shift (x, -2) = [7, 7, 7, 2, 5, 9, 3, 8]
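The three examples above are consistent with a shift that reads element i from position i + offset and clamps out-of-range positions to the nearest edge of the array. The following plain C# sketch reproduces them under that assumption. The method `Shift` here is a hypothetical stand-in written for illustration, not the Accelerator library's `ParallelArrays.Shift`.

```csharp
using System;

static class ShiftDemo
{
    // result[i] = x[i + offset], with the source index clamped to [0, x.Length - 1].
    public static int[] Shift(int[] x, int offset)
    {
        var result = new int[x.Length];
        for (int i = 0; i < x.Length; i++)
        {
            int src = Math.Min(Math.Max(i + offset, 0), x.Length - 1);
            result[i] = x[src];
        }
        return result;
    }

    static void Main()
    {
        int[] x = { 7, 2, 5, 9, 3, 8, 6, 4 };
        foreach (int k in new[] { 0, -1, -2 })
            Console.WriteLine("shift (x, {0}) = [{1}]", k, string.Join(", ", Shift(x, k)));
    }
}
```

Running it prints the same three rows as the slide.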
y = [y[0], y[1], y[2], y[3], y[4], y[5], y[6], y[7]]
  = a[0] * [x[0],  x[1],  x[2],  x[3],  x[4], x[5], x[6], x[7]] +
    a[1] * [x[-1], x[0],  x[1],  x[2],  x[3], x[4], x[5], x[6]] +
    a[2] * [x[-2], x[-1], x[0],  x[1],  x[2], x[3], x[4], x[5]] +
    a[3] * [x[-3], x[-2], x[-1], x[0],  x[1], x[2], x[3], x[4]] +
    a[4] * [x[-4], x[-3], x[-2], x[-1], x[0], x[1], x[2], x[3]]

y = a[0] * shift (x, 0) +
    a[1] * shift (x, -1) +
    a[2] * shift (x, -2) +
    a[3] * shift (x, -3) +
    a[4] * shift (x, -4)
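A small check that the whole-array formulation computes the same values as the sequential FIR loop can be sketched in plain C#. Two assumptions are worth stating loudly. First, `ShiftZeroFill` is a hypothetical helper that treats x[-1], x[-2], ... as 0 so that it matches the zero-initialised window of the sequential code; a shift that clamps at the boundary would differ in the first few outputs. Second, nothing here uses the Accelerator library: the arrays are ordinary `int[]` values.

```csharp
using System;
using System.Linq;

static class FirAsShifts
{
    // Hypothetical helper: result[i] = x[i + offset], reading 0 outside the array.
    public static int[] ShiftZeroFill(int[] x, int offset)
    {
        var r = new int[x.Length];
        for (int i = 0; i < x.Length; i++)
        {
            int src = i + offset;
            r[i] = (src >= 0 && src < x.Length) ? x[src] : 0;
        }
        return r;
    }

    // y = a[0] * shift(x, 0) + a[1] * shift(x, -1) + ... as whole-array operations.
    public static int[] FirByShifts(int[] a, int[] x)
    {
        var y = new int[x.Length];
        for (int k = 0; k < a.Length; k++)
        {
            int[] shifted = ShiftZeroFill(x, -k);
            for (int i = 0; i < y.Length; i++)
                y[i] += a[k] * shifted[i];
        }
        return y;
    }

    // The sliding-window loop, with the tap count taken from a.Length.
    public static int[] FirSequential(int[] a, int[] x)
    {
        var window = new int[a.Length];   // zero-initialised by the runtime
        var y = new int[x.Length];
        for (int i = 0; i < x.Length; i++)
        {
            for (int j = a.Length - 1; j > 0; j--)
                window[j] = window[j - 1];
            window[0] = x[i];
            int sum = 0;
            for (int z = 0; z < a.Length; z++)
                sum += a[z] * window[z];
            y[i] = sum;
        }
        return y;
    }

    static void Main()
    {
        int[] a = { 2, 5, 7, 4, 3 };
        int[] x = { 7, 2, 5, 9, 3, 8, 6, 4 };
        // Both formulations compute y[i] = sum over k of a[k] * x[i - k].
        Console.WriteLine(FirByShifts(a, x).SequenceEqual(FirSequential(a, x))); // True
    }
}
```

The point of the rewrite is that the outer loop over taps contains only whole-array multiply, shift and add operations, with no per-element dependency between samples.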
using Microsoft.ParallelArrays;
using A = Microsoft.ParallelArrays.ParallelArrays;

namespace AcceleratorSamples
{
    public class Convolver
    {
        public static float[] Convolver1D(Target computeTarget, float[] a, float[] x)
        {
            var xpar = new FloatParallelArray(x);
            var n = x.Length;
            var ypar = new FloatParallelArray(0.0f, new[] { n });
            for (int i = 0; i < a.Length; i++)
                ypar += a[i] * A.Shift(xpar, -i);
            float[] result = computeTarget.ToArray1D(ypar);
            return result;
        }
    }
}
using System;
using System.Linq;
using Microsoft.ParallelArrays;

namespace AcceleratorSamples
{
    static class Convolver2D
    {
        // shifts maps a tap index to the shift vector for that tap.
        static FloatParallelArray convolve(this FloatParallelArray a, Func<int, int[]> shifts, float[] kernel)
        {
            return kernel
                .Select((k, i) => k * ParallelArrays.Shift(a, shifts(i)))
                .Aggregate((a1, a2) => a1 + a2);
        }

        static FloatParallelArray convolveXY(this FloatParallelArray input, float[] kernel)
        {
            return input
                .convolve(i => new[] { -i, 0 }, kernel)
                .convolve(i => new[] { 0, -i }, kernel);
        }

        static void Main(string[] args)
        {
            const int inputSize = 10;
            var random = new Random(42);
            var inputData = new float[inputSize, inputSize];
            for (int row = 0; row < inputSize; row++)
                for (int col = 0; col < inputSize; col++)
                    inputData[row, col] = (float)random.NextDouble() * random.Next(1, 100);
            var testKernel = new[] { 2F, 5, 7, 4, 3 };
            var dx9Target = new DX9Target();
            var inputArray = new FloatParallelArray(inputData);
            var result = dx9Target.ToArray2D(inputArray.convolveXY(testKernel));
            for (var row = 0; row < inputSize; row++)
            {
                for (int col = 0; col < inputSize; col++)
                    Console.Write("{0} ", result[row, col]);
                Console.WriteLine();
            }
        }
    }
}
open System
open Microsoft.ParallelArrays

[<EntryPoint>]
let main(args) =
    // Declare a filter kernel for the convolution
    let testKernel = Array.map float32 [| 2; 5; 7; 4; 3 |]
    // Specify the size of each dimension of the input array
    let inputSize = 10
    // Create a pseudo-random number generator
    let random = Random (42)
    // Declare a pseudo-random input data array
    let testData = Array2D.init inputSize inputSize (fun i j -> float32 (random.NextDouble() * float (random.Next(1, 100))))
    // Create an Accelerator float parallel array for the F# input array
    use testArray = new FloatParallelArray(testData)
    // Declare a function to convolve in the X or Y direction
    let rec convolve (shifts : int -> int []) (kernel : float32 []) i (a : FloatParallelArray) =
        let e = kernel.[i] * ParallelArrays.Shift(a, shifts i)
        if i = 0 then
            e
        else
            e + convolve shifts kernel (i-1) a
    // Declare a 2D convolver
    let convolveXY kernel input =
        // First convolve in the X direction and then in the Y direction
        let convolveX = convolve (fun i -> [| -i; 0 |]) kernel (kernel.Length - 1) input
        let convolveY = convolve (fun i -> [| 0; -i |]) kernel (kernel.Length - 1) convolveX
        convolveY
    // Create a DX9 target and use it to convolve the test input
    use dx9Target = new DX9Target()
    let convolveDX9 = dx9Target.ToArray2D (convolveXY testKernel testArray)
    printfn "DX9: -> \r\n%A" convolveDX9
    0
[Figure: x64 multicore target benchmark for 2D convolver (24 core server Xeon E7540). Speedup over one core (y axis) against kernel size (x axis), with series for 6, 12, 18 and 24 core speedup.]
Convolver
8.249ns max delay
3 x DSP48Es
63 slice registers
24 slice LUTs
DRAM
Power of Computation vs. Communication
technology node             130nm CMOS (2006)    45nm CMOS (2008)
transfer 32b across-chip    20 computations      57 computations
transfer 32b off-chip       260 computations     1300 computations
numbers derived from work by W. Dally, Stanford
Virtex-6 XC6VLX240T-1
317MHz
Area 6%
108 DSP48E1 (out of 768)
2.7W (at 25C)
1,110 mega-samples per second
cf. CUDA version on 470GTX at 552 mega-samples per second
(single precision)
Kiwi Thesis
[Figure: thread 1, thread 2, thread 3]
http://upload.wikimedia.org/wikipedia/fr/b/b4/Mono_project_logo.svg
http://www.amazon.com/gp/product/images/0470052627/ref=dp_image_0?ie=UTF8&n=283155&s=books
http://msdn.microsoft.com/en-us/devlabs/dd491992.aspx
Kiwi
[Figure: design-entry styles. Structural, gate-level descriptions (VHDL/Verilog); imperative C programs such as jpeg.c (C-to-gates); parallel imperative programs made of threads 1, 2 and 3 (Kiwi).]
Kiwi Library: Kiwi.cs
circuit model: JPEG.cs
Visual Studio: multi-thread simulation, debugging, verification
Kiwi Synthesis
circuit implementation: JPEG.v
[Figure: each thread of the parallel C# program goes through its own C-to-gates step to produce a circuit; the resulting circuits are combined into Verilog for the system.]
Ports and Clocks
public static class I2C
{
    [OutputBitPort("scl")]
    static bool scl;
    [InputBitPort("sda_in")]
    static bool sda_in;
    [OutputBitPort("sda_out")]
    static bool sda_out;
    [OutputBitPort("rw")]
    static bool rw;
circuit ports identified by custom attribute
public static int max2(int a, int b)
{
    int result;
    if (a > b)
        result = a;
    else
        result = b;
    return result;
}

.method public hidebysig static int32 max2(int32 a, int32 b) cil managed
{
    // Code size 12 (0xc)
    .maxstack 2
    .locals init ([0] int32 result)
    IL_0000: ldarg.0
    IL_0001: ldarg.1
    IL_0002: ble.s IL_0008
    IL_0004: ldarg.0
    IL_0005: stloc.0
    IL_0006: br.s IL_000a
    IL_0008: ldarg.1
    IL_0009: stloc.0
    IL_000a: ldloc.0
    IL_000b: ret
}
[Figure: evaluating max2(3, 7), showing the contents of the stack and of local memory as the instructions execute]
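To follow the stack and the local slot by hand, the ten IL instructions of max2 can be replayed by a tiny interpreter. This is only an illustrative sketch: the class `Max2Il` and its `Run` method are hypothetical names, the program counter uses the IL offsets from the listing, and the switch covers just the opcodes that max2 uses.

```csharp
using System;
using System.Collections.Generic;

static class Max2Il
{
    // Steps through the IL of max2, one case per instruction offset.
    public static int Run(int a, int b)
    {
        var stack = new Stack<int>();  // evaluation stack (.maxstack 2)
        int result = 0;                // local 0, zeroed by ".locals init"
        int pc = 0x0;
        while (true)
        {
            switch (pc)
            {
                case 0x0: stack.Push(a); pc = 0x1; break;         // ldarg.0
                case 0x1: stack.Push(b); pc = 0x2; break;         // ldarg.1
                case 0x2:                                         // ble.s IL_0008
                {
                    int value2 = stack.Pop();
                    int value1 = stack.Pop();
                    pc = value1 <= value2 ? 0x8 : 0x4;
                    break;
                }
                case 0x4: stack.Push(a); pc = 0x5; break;         // ldarg.0
                case 0x5: result = stack.Pop(); pc = 0x6; break;  // stloc.0
                case 0x6: pc = 0xa; break;                        // br.s IL_000a
                case 0x8: stack.Push(b); pc = 0x9; break;         // ldarg.1
                case 0x9: result = stack.Pop(); pc = 0xa; break;  // stloc.0
                case 0xa: stack.Push(result); pc = 0xb; break;    // ldloc.0
                case 0xb: return stack.Pop();                     // ret
                default: throw new InvalidOperationException("unexpected offset " + pc);
            }
        }
    }

    static void Main()
    {
        Console.WriteLine(Run(3, 7)); // 7
    }
}
```

For max2(3, 7) the branch at IL_0002 is taken because 3 <= 7, so the second argument is stored into the local and then returned.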
Writing to a Channel
using System.Threading;

public class Channel<T>
{
    T datum;
    bool empty = true;

    public void Write(T v)
    {
        lock (this)
        {
            // Block until the previous value has been read.
            while (!empty)
                Monitor.Wait(this);
            datum = v;
            empty = false;
            Monitor.PulseAll(this);
        }
    }
Reading from a Channel
    public T Read()
    {
        T r;
        lock (this)
        {
            // Block until a value has been written.
            while (empty)
                Monitor.Wait(this);
            empty = true;
            r = datum;
            Monitor.PulseAll(this);
        }
        return r;
    }
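Put together, Write and Read form a one-place blocking channel. The sketch below combines the two methods into one class and runs it as ordinary multi-threaded software, with one producer thread and the main thread as consumer. `ChannelDemo` and `SumThroughChannel` are hypothetical names added for the demonstration, and this shows the software behaviour only, not anything about the circuit a synthesis tool would produce.

```csharp
using System;
using System.Threading;

class Channel<T>
{
    T datum;
    bool empty = true;

    public void Write(T v)
    {
        lock (this)
        {
            while (!empty)            // wait until the slot is free
                Monitor.Wait(this);
            datum = v;
            empty = false;
            Monitor.PulseAll(this);   // wake any waiting reader
        }
    }

    public T Read()
    {
        T r;
        lock (this)
        {
            while (empty)             // wait until a value is present
                Monitor.Wait(this);
            empty = true;
            r = datum;
            Monitor.PulseAll(this);   // wake any waiting writer
        }
        return r;
    }
}

static class ChannelDemo
{
    // Sends 1..count through the channel and sums what arrives.
    public static int SumThroughChannel(int count)
    {
        var ch = new Channel<int>();
        var producer = new Thread(() =>
        {
            for (int i = 1; i <= count; i++)
                ch.Write(i);
        });
        producer.Start();

        int sum = 0;
        for (int i = 0; i < count; i++)
            sum += ch.Read();
        producer.Join();
        return sum;
    }

    static void Main()
    {
        Console.WriteLine(SumThroughChannel(100)); // 5050
    }
}
```

Because the channel holds a single value, every Write blocks until the previous value has been read, so the producer and consumer proceed in lock step.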
systems level concurrency constructs
threads, events, monitors, condition variables
rendezvous
join patterns
transactional memory
data parallelism
user applications
domain specific languages
Filter Example
thread
one-place channel
Transposed Filter
Inter-thread Communication and Synchronization
// Create the channels to link together the taps
for (int c = 0; c < size; c++)
{
    Xchannels[c] = new Kiwi.Channel<int>();
    Ychannels[c] = new Kiwi.Channel<int>();
    Ychannels[c].Write(0); // Pre-populate y-channel registers with zeros
}
// Connect up the taps for a transposed filter
for (int i = 0; i < size; i++)
{
    int j = i; // Quiz: why do we need the local j?
    Thread tapThread = new Thread(delegate() { Tap(j, weights[j],
                                                   Xchannels[j],
                                                   Ychannels[j],
                                                   Ychannels[j+1]); });
    tapThread.Start();
}
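One way to answer the quiz: a C# `for` loop has a single loop variable shared by every iteration, and an anonymous delegate captures the variable itself rather than its current value. A delegate that mentions `i` therefore sees whatever `i` holds when the delegate finally runs, which for a newly started thread may be a later value or even `size`. Copying into a local declared inside the loop body gives each delegate its own variable. The sketch below shows the effect without threads; `CaptureDemo` and its methods are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class CaptureDemo
{
    // Every delegate captures the one shared loop variable i.
    public static int[] CaptureLoopVariable(int size)
    {
        var actions = new List<Func<int>>();
        for (int i = 0; i < size; i++)
            actions.Add(() => i);
        // The delegates run after the loop has finished, when i == size.
        return actions.Select(f => f()).ToArray();
    }

    // Each iteration declares a fresh j, so each delegate captures its own copy.
    public static int[] CaptureLocalCopy(int size)
    {
        var actions = new List<Func<int>>();
        for (int i = 0; i < size; i++)
        {
            int j = i;
            actions.Add(() => j);
        }
        return actions.Select(f => f()).ToArray();
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", CaptureLoopVariable(3))); // 3, 3, 3
        Console.WriteLine(string.Join(", ", CaptureLocalCopy(3)));    // 0, 1, 2
    }
}
```

In the tap-connection loop the same rule applies: without `j`, several tap threads could end up wired to the same channels, or index past the end of the arrays.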
static public void echo()
{
    tx_sof_n = !false; // We are not at the start of a frame
    tx_src_rdy_n = !false;
    tx_eof_n = !false; // We are not at the end of a frame
    bool start = !rx_sof_n && !rx_src_rdy_n; // The start condition
    int i, j;
    bool doneReading;
    while (true) // Process packets indefinitely
    {
        // Wait for SOF and SRC_RDY
        while (!start)
        {
            Kiwi.Pause(); // Wait for a clock tick
            start = !rx_sof_n && !rx_src_rdy_n; // Check for start of frame
        }
        // Read in the entire frame
        i = 0;
        doneReading = false;
        // Read the remaining bytes
        while (!doneReading)
        {
            if (!rx_src_rdy_n)
            {
                buffer[i] = rx_data;
                i++;
            }
            doneReading = !rx_eof_n;
            Kiwi.Pause();
        }
C#
soft processor
fib :: Int -> Int
fib 0 = 0
fib 1 = 1
fib n = n1 + n2
  where n1 = fib (n - 1)
        n2 = fib (n - 2)
STATE 1 FREE
  PRECASE
  ds1 := ds
  CASE ds1
    WHEN 1 => RETURN 1
    WHEN 0 => RETURN 0
    WHEN others => v0 := ds1 - 2
                   RECURSE [v0] 2 [ds1]
  ENDCASE
STATE 3 FREE n2
  n1 := resultInt
  v2 := n1 + n2
  RETURN v2
STATE 2 FREE ds1
  n2 := resultInt
  v1 := ds1 - 1
  RECURSE [v1] 3 [n2]
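One plausible reading of this listing is a state machine with an explicit stack. Under that reading, `RECURSE [args] s [saved]` starts a fresh activation in state 1 with the given argument, and on return resumes in state s with the saved value restored and the callee's result in resultInt. The C# sketch below follows that interpretation. It is an assumption about the notation, not the actual compiler output or runtime, and `FibStateMachine` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

static class FibStateMachine
{
    public static int Fib(int n)
    {
        // Each frame holds (state to resume in, value kept live across the call).
        var stack = new Stack<Tuple<int, int>>();
        int state = 1;
        int ds = n;         // argument of the current activation
        int saved = 0;      // value restored on resuming in state 2 or 3
        int resultInt = 0;  // result of the most recent RETURN

        while (true)
        {
            switch (state)
            {
                case 1:
                    if (ds == 0 || ds == 1)
                    {
                        resultInt = ds;                    // RETURN 0 / RETURN 1
                        if (stack.Count == 0) return resultInt;
                        var frame = stack.Pop();
                        state = frame.Item1;
                        saved = frame.Item2;
                    }
                    else
                    {
                        stack.Push(Tuple.Create(2, ds));   // RECURSE [v0] 2 [ds1]
                        ds = ds - 2;                       // v0 := ds1 - 2
                    }
                    break;
                case 2:
                {
                    int n2 = resultInt;                    // n2 := resultInt
                    stack.Push(Tuple.Create(3, n2));       // RECURSE [v1] 3 [n2]
                    ds = saved - 1;                        // v1 := ds1 - 1
                    state = 1;
                    break;
                }
                case 3:
                {
                    int n1 = resultInt;                    // n1 := resultInt
                    resultInt = n1 + saved;                // v2 := n1 + n2; RETURN v2
                    if (stack.Count == 0) return resultInt;
                    var frame = stack.Pop();
                    state = frame.Item1;
                    saved = frame.Item2;
                    break;
                }
                default:
                    throw new InvalidOperationException("unknown state " + state);
            }
        }
    }

    static void Main()
    {
        Console.WriteLine(Fib(10)); // 55
    }
}
```

Like the Haskell definition, this sketch does not terminate for negative arguments. The stack depth grows with n, which is the storage a hardware implementation of the recursion would also have to provide.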
relocation via virtualization???
+ encryption + virtualization + data-processing
no standard ABI
no FPGA-kernel-userspace model
The cloud is just an extension of existing OS paradigms. FPGAs get left behind: they lack abstraction boundaries.
Split Trust
Split Trust
managing physical device vs. using a physical device
management domain?
FPGAs Improve Cloud Security