10
XPS Project Report Automatically Scalable Computation (ASC) Jonathan Appavoo, Steve Homer, & Ajay Joshi Boston University Tommy Unger, Tomislav Petrovic, & Schuyler Eldridge Margo Seltzer, Ryan P. Adams, & David Brooks Harvard University Amos Waterland, & Yu (Emma) Wang

XPS Project Report Automatically Scalable Computation (ASC)synergy.cs.vt.edu/.../reports/Jonathan_Appavoo_3-xpsprojectreport.pdf · XPS Project Report Automatically Scalable Computation

  • Upload
    others

  • View
    7

  • Download
    0

Embed Size (px)

Citation preview

Page 1: XPS Project Report Automatically Scalable Computation (ASC)synergy.cs.vt.edu/.../reports/Jonathan_Appavoo_3-xpsprojectreport.pdf · XPS Project Report Automatically Scalable Computation

XPS Project Report Automatically Scalable

Computation (ASC)Jonathan Appavoo, Steve Homer, & Ajay Joshi Boston University

Tommy Unger, Tomislav Petrovic, & Schuyler Eldridge Margo Seltzer, Ryan P. Adams, & David Brooks Harvard University

Amos Waterland, & Yu (Emma) Wang

Page 2: XPS Project Report Automatically Scalable Computation (ASC)synergy.cs.vt.edu/.../reports/Jonathan_Appavoo_3-xpsprojectreport.pdf · XPS Project Report Automatically Scalable Computation

HW

SW

SpeedSpeed SW Performance

Scaling SW Performance as function of a physical parameter

Page 3: XPS Project Report Automatically Scalable Computation (ASC)synergy.cs.vt.edu/.../reports/Jonathan_Appavoo_3-xpsprojectreport.pdf · XPS Project Report Automatically Scalable Computation

A FUNDAMENTAL QUESTION

HW

SW

SizeSpeed SW Performance

Size

Can we scale software performance as a function of system size (eg. transistor count — not necessarily cpus)?

Page 4: XPS Project Report Automatically Scalable Computation (ASC)synergy.cs.vt.edu/.../reports/Jonathan_Appavoo_3-xpsprojectreport.pdf · XPS Project Report Automatically Scalable Computation

Exposing and Exploiting Structure in Execution

0

Execution as n bit "video" signal of system state

...

... n-1

...... ...

One row of pixels is a frame of execution and describes entire system state at a particular step of instruction execution

...

One column of pixels for each bit of system state (register, ram and memory-mapped I/O devices)

initial state

future state

current state

Execution "video" proceeds from top to bottom

Page 5: XPS Project Report Automatically Scalable Computation (ASC)synergy.cs.vt.edu/.../reports/Jonathan_Appavoo_3-xpsprojectreport.pdf · XPS Project Report Automatically Scalable Computation

Intuition

Page 6: XPS Project Report Automatically Scalable Computation (ASC)synergy.cs.vt.edu/.../reports/Jonathan_Appavoo_3-xpsprojectreport.pdf · XPS Project Report Automatically Scalable Computation

An Approach

Page 7: XPS Project Report Automatically Scalable Computation (ASC)synergy.cs.vt.edu/.../reports/Jonathan_Appavoo_3-xpsprojectreport.pdf · XPS Project Report Automatically Scalable Computation

Making things tractable

(a) original learning problem

initial state

(d) new learning problem

?

?

?

current state

possible future state

(b) recognized hyperplane

(c) hyperplane intersections

?

observed trajectory

state space (schematic)

hyperplane induced by

predictive probability

Page 8: XPS Project Report Automatically Scalable Computation (ASC)synergy.cs.vt.edu/.../reports/Jonathan_Appavoo_3-xpsprojectreport.pdf · XPS Project Report Automatically Scalable Computation

A Simple Example• Search for a prime

factor of a number

• in this case 9223371994482243049

/* factor.c */ #include <stdio.h> #include <stdlib.h>

static long invert(long n) { register long p;

for (p = 3; p * p <= n; p += 2) if ((n % p) == 0) return p;

return n; }

int main(int argc, char *argv[]) { long p, n = 9223371994482243049;

if (argc > 1) n = atol(argv[1]); p = invert(n); #if 0 printf("p = %ld\n", p); #endif return p; }

Amos Waterland

Page 9: XPS Project Report Automatically Scalable Computation (ASC)synergy.cs.vt.edu/.../reports/Jonathan_Appavoo_3-xpsprojectreport.pdf · XPS Project Report Automatically Scalable Computation

Seconds

20

30

40

50

60

$ time ./factor > out $ for ((i=0;i<10;i++)); do time ./factor > out; done $ time ./asc factor > out $ time ./asc -a 40013f -i 23068684 factor > out $ for ((i=0;i<10;i++)); do time ./asc -a 40013f -i 23068684 factor.net factor > out; done $ ./asc factor.* > out.train & sleep 30; kill %1 ./asc:main.c:233: batch training `factor.net' on `factor.train' ... $ time ./asc -a 40013f -i 23068684 factor.net factor > out

37.853s (29.4%) 79kB Net File

53.632

Page 10: XPS Project Report Automatically Scalable Computation (ASC)synergy.cs.vt.edu/.../reports/Jonathan_Appavoo_3-xpsprojectreport.pdf · XPS Project Report Automatically Scalable Computation

Status and DirectionsOS : Linux ASC Processes — ASC exec flag, Address Space Management, “Fat” Binary

Theory : Limits, Hardness and Complexity implications

Compiler : Language Level Hyperplanes, Approximation Opportunities and state space profile information

HW : Neuromophic Acceleration, Hyperplane Support, State profiling, State Cache Acceleration

OS

Theory Compiler

HW