Upload
others
View
9
Download
0
Embed Size (px)
Citation preview
CS:APP
CISC 360CISC 360Midterm Exam - ReviewMidterm Exam - Review
AlignmentAlignment
Michela TauferOctober 14, 2008
Powerpoint Lecture Notes for Computer Systems: A Programmer'sPerspective, R. Bryant and D. O'Hallaron, Prentice Hall, 2003
– 2 – CISCʼFa08
Alignment
Aligned DataAligned Data Primitive data type requires K bytes Address must be multiple of K Required on some machines; advised on IA32
treated differently by Linux and Windows!
Motivation for Aligning DataMotivation for Aligning Data Memory accessed by (aligned) double or quad-words
Inefficient to load or store datum that spans quad wordboundaries
Virtual memory very tricky when datum spans 2 pages
CompilerCompiler Inserts gaps in structure to ensure correct alignment of
fields
– 3 – CISCʼFa08
Specific Cases of AlignmentSize of Primitive Data Type:Size of Primitive Data Type:
1 byte (e.g., char) no restrictions on address
2 bytes (e.g., short) lowest 1 bit of address must be 02
4 bytes (e.g., int, float, char *, etc.) lowest 2 bits of address must be 002
8 bytes (e.g., double) Windows (and most other OSʼs & instruction sets):
» lowest 3 bits of address must be 0002 Linux:
» lowest 2 bits of address must be 002» i.e., treated the same as a 4-byte primitive data type
12 bytes (long double) Linux:
» lowest 2 bits of address must be 002» i.e., treated the same as a 4-byte primitive data type
– 4 – CISCʼFa08
struct S1 { char c; int i[2]; double v;} *p;
Satisfying Alignment with StructuresOffsets Within StructureOffsets Within Structure
Must satisfy elementʼs alignment requirement
Overall Structure PlacementOverall Structure Placement Each structure has alignment requirement K
Largest alignment of any element Initial address & structure length must be
multiples of K
Example (under Windows):Example (under Windows): K = 8, due to double elementc i[0] i[1] vp+0 p+4 p+8 p+16 p+24
Multiple of 4 Multiple of 8
Multiple of 8 Multiple of 8
– 5 – CISCʼFa08
Linux vs. Windows
Windows (including Windows (including CygwinCygwin):): K = 8, due to double element
Linux:Linux: K = 4; double treated like a 4-byte data type
struct S1 { char c; int i[2]; double v;} *p;
c i[0] i[1] v
p+0 p+4 p+8 p+16 p+24
Multiple of 4 Multiple of 8Multiple of 8 Multiple of 8
c i[0] i[1]p+0 p+4 p+8
Multiple of 4 Multiple of 4Multiple of 4
vp+12 p+20
Multiple of 4
– 6 – CISCʼFa08
Overall Alignment Requirementstruct S2 { double x; int i[2]; char c;} *p;
struct S3 { float x[2]; int i[2]; char c;} *p;
ci[0] i[1]
p+0 p+12p+8 p+16 p+20
x[0] x[1]
p+4
p must be multiple of: 8 for Windows4 for Linux
p must be multiple of 4 (in either OS)
p+0 p+12p+8 p+16 Windows: p+24Linux: p+20
ci[0] i[1]x
– 7 – CISCʼFa08
Ordering Elements Within Structurestruct S4 { char c1; double v; char c2; int i;} *p;
struct S5 { double v; char c1; char c2; int i;} *p;
c1 iv
p+0 p+20p+8 p+16 p+24
c2
c1 iv
p+0 p+12p+8 p+16
c2
10 bytes wasted space in Windows
2 bytes wasted space
– 8 – CISCʼFa08
Arrays of StructuresPrinciplePrinciple
Allocated by repeating allocationfor array type
In general, may nest arrays &structures to arbitrary depth
a[0]
a+0
a[1] a[2]
a+12 a+24 a+36• • •
a+12 a+20a+16 a+24
struct S6 { short i; float v; short j;} a[10];
a[1].i a[1].ja[1].v
– 9 – CISCʼFa08
Union AllocationPrinciplesPrinciples
Overlay union elements Allocate according to largest element Can only use one field at a time
union U1 { char c; int i[2]; double v;} *up;
ci[0] i[1]
vup+0 up+4 up+8struct S1 {
char c; int i[2]; double v;} *sp;
c i[0] i[1] vsp+0 sp+4 sp+8 sp+16 sp+24
(Windows alignment)
CS:APP
CISC 360CISC 360Computer ArchitectureComputer ArchitectureLogic DesignLogic Design
Michela TauferOctober 14, 2008
Powerpoint Lecture Notes for Computer Systems: A Programmer'sPerspective, R. Bryant and D. O'Hallaron, Prentice Hall, 2003
– 11 – CISCʼFa08
Overview of Logic DesignFundamental Hardware RequirementsFundamental Hardware Requirements
Communication How to get values from one place to another
Computation Storage
Bits are Our FriendsBits are Our Friends Everything expressed in terms of values 0 and 1 Communication
Low or high voltage on wire Computation
Compute Boolean functions Storage
Store bits of information
– 12 – CISCʼFa08
Digital Signals
Use voltage thresholds to extract discrete values fromcontinuous signal
Simplest version: 1-bit signal Either high range (1) or low range (0) With guard range between them
Not strongly affected by noise or low quality circuit elements Can make circuits simple, small, and fast
Voltage
Time
0 1 0
– 13 – CISCʼFa08
Computing with Logic Gates
Outputs are Boolean functions of inputs Respond continuously to changes in inputs
With some, small delay
a
bout
a
bout a out
out = a && b out = a || b out = !a
And Or Not
Voltage
Time
a
ba && b
Rising Delay Falling Delay
– 14 – CISCʼFa08
Combinational Circuits
Acyclic Network of Logic GatesAcyclic Network of Logic Gates Continously responds to changes on primary inputs Primary outputs become (after some delay) Boolean
functions of primary inputs
Acyclic Network
PrimaryInputs
PrimaryOutputs
– 15 – CISCʼFa08
Bit Equality
Generate 1 if a and b are equal
Hardware Control Language (HCL)Hardware Control Language (HCL) Very simple hardware description language
Boolean operations have syntax similar to C logical operations Weʼll use it to describe control logic for processors
Bit equala
b
eqbool eq = (a&&b)||(!a&&!b)
HCL Expression
– 16 – CISCʼFa08
Word Equality
32-bit word size HCL representation
Equality operation Generates Boolean value
b31Bit equal
a31
eq31
b30Bit equal
a30
eq30
b1Bit equal
a1
eq1
b0Bit equal
a0
eq0
Eq
=B
A
Eq
Word-Level Representation
bool Eq = (A == B)
HCL Representation
– 17 – CISCʼFa08
Bit-Level Multiplexor
Control signal s Data signals a and b Output a when s=1, b when s=0
Bit MUX
b
s
a
out
bool out = (s&&a)||(!s&&b)
HCL Expression
– 18 – CISCʼFa08
Word Multiplexor
Select input word A or Bdepending on control signal s
HCL representation Case expression Series of test : value pairs Output value for first
successful test
Word-Level Representation
HCL Representation
b31
s
a31
out31
b30
a30
out30
b0
a0
out0
int Out = [ s : A; 1 : B;];
s
B
AOutMUX
– 19 – CISCʼFa08
HCL Word-Level Examples
Find minimum of threeinput words
HCL case expression Final case guarantees
matchA
Min3MIN3BC
int Min3 = [ A < B && A < C : A; B < A && B < C : B; 1 : C;];
D0
D3
Out4
s0s1
MUX4D2D1
Select one of 4 inputsbased on two controlbits
HCL case expression Simplify tests by
assuming sequentialmatching
int Out4 = [ !s1&&!s0: D0; !s1 : D1; !s0 : D2; 1 : D3;];
Minimum of 3 Words
4-Way Multiplexor
– 20 – CISCʼFa08
OFZFCF
OFZFCF
OFZFCF
OFZFCF
Arithmetic Logic Unit
Combinational logic Continuously responding to inputs
Control signal selects function computed Corresponding to 4 arithmetic/logical operations in Y86
Also computes values for condition codes
ALU
Y
X
X + Y
0
ALU
Y
X
X - Y
1
ALU
Y
X
X & Y
2
ALU
Y
X
X ^ Y
3
A
B
A
B
A
B
A
B
– 21 – CISCʼFa08
Registers
Stores word of data Different from program registers seen in assembly code
Collection of edge-triggered latches Loads input on rising edge of clock
I O
Clock
DC Q+
DC Q+
DC Q+
DC Q+
DC Q+
DC Q+
DC Q+
DC Q+
i7i6i5i4i3i2i1i0
o7
o6
o5
o4
o3
o2
o1
o0
Clock
Structure
– 22 – CISCʼFa08
Register Operation
Stores data bits For most of time acts as barrier between input and output As clock rises, loads input
State = xRisingclock
Output = xInput = yx
State = y
Output = yy
– 23 – CISCʼFa08
State Machine Example
Accumulatorcircuit
Load oraccumulate oneach cycle
Comb. Logic
ALU
0
OutMUX
0
1
Clock
InLoad
x0 x1 x2 x3 x4 x5
x0 x0+x1 x0+x1+x2 x3 x3+x4 x3+x4+x5
Clock
Load
In
Out
– 24 – CISCʼFa08
Random-Access Memory
Stores multiple words of memory Address input specifies which word to read or write
Register file Holds values of program registers %eax, %esp, etc. Register identifier serves as address
» ID 8 implies no read or write performed Multiple Ports
Can read and/or write multiple words in one cycle» Each has separate address and data input/output
Registerfile
A
B
W dstW
srcA
valA
srcB
valB
valW
Read ports Write port
Clock
– 25 – CISCʼFa08
Register File TimingReadingReading
Like combinational logic Output data generated based on
input address After some delay
WritingWriting Like register Update only as clock rises
Registerfile
A
B
srcA
valA
srcB
valB
y2
Registerfile
W dstW
valW
Clock
x2Risingclock Register
fileW dstW
valW
Clock
y2
x2
x
2
– 26 – CISCʼFa08
Hardware Control Language Very simple hardware description language Can only express limited aspects of hardware operation
Parts we want to explore and modify
Data TypesData Types bool: Boolean
a, b, c, … int: words
A, B, C, … Does not specify word size---bytes, 32-bit words, …
StatementsStatements bool a = bool-expr ; int A = int-expr ;
– 27 – CISCʼFa08
HCL Operations Classify by type of value returned
Boolean ExpressionsBoolean Expressions Logic Operations
a && b, a || b, !a Word Comparisons
A == B, A != B, A < B, A <= B, A >= B, A > B Set Membership
A in { B, C, D }» Same as A == B || A == C || A == D
Word ExpressionsWord Expressions Case expressions
[ a : A; b : B; c : C ] Evaluate test expressions a, b, c, … in sequence Return word expression A, B, C, … for first successful test
– 28 – CISCʼFa08
SummaryComputationComputation
Performed by combinational logic Computes Boolean functions Continuously reacts to input changes
StorageStorage Registers
Hold single words Loaded as clock rises
Random-access memories Hold multiple words Possible multiple read or write ports Read word when address input changes Write word as clock rises
– 29 – CISCʼFa08
Deadlines
7 Oct 14 Lec11 – Logic Design Lab 2
7 Oct 16 Lec12 – Sequential Implementation
8 Oct 21 Lec13 – Pipelined Implementation I
8 Oct 23 Lec14 – Pipelined Implementation II
8 (*) Oct 24 Exercises in class in preparation to exam
9 Oct 28 Lec15 – Computer Architecture – Wrap-
up
9 Oct 30 Exam II
10 Nov 4 No class
10 Nov 6 Lec16 – Program Optimization I 5 4 Lab 3