Upload
others
View
2
Download
0
Embed Size (px)
Citation preview
TDDD56 Lesson 1 August Ernstsson 2017
General information
• Labs performed in groups of two(or ask if you have a specific reason for exception)
• Attendance is mandatory
• Need an IDA student account for lab computers
TDDD56 Lesson 1 August Ernstsson 2017
General information• Two lab rooms
• Southfork
• Multicore Lab”Konrad Zuse”
TDDD56 Lesson 1 August Ernstsson 2017
General information• Examination by demonstration
• Prepare well
• Show/run program, show results (plots), and describe each step
• Explain as if the assistant does not know about the lab skeleton beforehand
• No reports is nice but means that a good demonstration is crucial!
• Work in autonomyExperiment with the given code, Makefiles, parameters etc.
• You can discuss between groups, but no copying!
TDDD56 Lesson 1 August Ernstsson 2017
Assistants
• August Ernstsson: CPU + GPU labs, Groups A+C
• Lu Li: CPU + GPU labs, Groups A+B
• Ingemar Ragnemalm: GPU labs, Group B
TDDD56 Lesson 1 August Ernstsson 2017
Groups
• Group A: Large group in Southfork
• Group B: Small group in Multicore Lab
• Group C: Small group in Multicore Lab
• Sign up in WebReg as soon as possible!
TDDD56 Lesson 1 August Ernstsson 2017
Lab Structure
CPU
Lab 1 Load Balancing
Lab 2 Non-blocking Data Structures
Lab 3 High-level Parallel Programming
GPU
Lab 4 CUDA 1
Lab 5 CUDA 2
Lab 6 OpenCL
Resp
onsi
ble:
Aug
ust
Resp
onsi
ble:
Inge
mar
TDDD56 Lesson 1 August Ernstsson 2017
Lab 1: Load Balancing• Working with threads (Pthreads)
on multicore CPU
• Mandelbrot fractal image generation
• Test if a complex number is in the Mandelbrot set
• For those interested in the maths, check out:
• https://en.wikipedia.org/wiki/Mandelbrot_set
• https://www.youtube.com/watch?v=NGMRB4O922I
Lab 1: Load balancingParallelize the generation of the graphic representation of asubset of Mandelbrot set.
Figure: A representation of the Mandelbrot set (black area) in the range�2 Cre 0.6 and �1 Cim 1.
Nicolas Melot nicolas.melot (at) liu.se (LIU) TDDD56 lesson 1 November 4, 2015 3 / 40
TDDD56 Lesson 1 August Ernstsson 2017
Mandelbrot Algorithmint is_in_Mandelbrot(float Cre, float Cim){ int iter; float x=0.0, y=0.0, xto2=0.0, yto2=0.0, dist2; for (iter = 0; iter <= MAXITER; iter++) { y = x * y; y = y + y + Cim; x = xto2 − yto2 + Cre; xto2 = x * x; yto2 = y * y; dist2 = xto2 + yto2; if ((int)dist2 >= MAXDIV) break; // diverges } return iter; }
Lab 1: Load balancingParallelize the generation of the graphic representation of asubset of Mandelbrot set.
Figure: A representation of the Mandelbrot set (black area) in the range�2 Cre 0.6 and �1 Cim 1.
Nicolas Melot nicolas.melot (at) liu.se (LIU) TDDD56 lesson 1 November 4, 2015 3 / 40
TDDD56 Lesson 1 August Ernstsson 2017
Load Balancing
• Each image pixel is an independent unit of work
• => embarrassingly parallel!
• However, all pixels are not equal amount of work!
• Load balancing becomes a problem.
Lab 1: Load balancingParallelize the generation of the graphic representation of asubset of Mandelbrot set.
Figure: A representation of the Mandelbrot set (black area) in the range�2 Cre 0.6 and �1 Cim 1.
Nicolas Melot nicolas.melot (at) liu.se (LIU) TDDD56 lesson 1 November 4, 2015 3 / 40
TDDD56 Lesson 1 August Ernstsson 2017
Lab 1
• Goal for the lab:
• Implement a solution with near-equal load
• Try different approaches
• Utilize properties of the domain
• How well will your solution work in a general case?
Lab 1: Load balancingParallelize the generation of the graphic representation of asubset of Mandelbrot set.
Figure: A representation of the Mandelbrot set (black area) in the range�2 Cre 0.6 and �1 Cim 1.
Nicolas Melot nicolas.melot (at) liu.se (LIU) TDDD56 lesson 1 November 4, 2015 3 / 40
TDDD56 Lesson 1 August Ernstsson 2017
Lab 2: Non-blocking Stack
• Working with Pthreads on multicore CPU
• Using atomic operations (CAS)
• Implementing efficient parallel data structures
A B C
TDDD56 Lesson 1 August Ernstsson 2017
Unbounded Stacks
• Stacks implemented as linked lists
• Non-blocking: NO LOCKS!
• Push and Pop operations with atomic instructions
A B C
TDDD56 Lesson 1 August Ernstsson 2017
Compare-and-Swap• Do atomically:
• If pointer != old pointer: do nothingElse: swap pointer to new pointer
• Typically used only for compare + assign, no swap
CAS(void** pointer, void* old, void* new) { atomic { if(*pointer == old) *pointer = new; } return old; }
TDDD56 Lesson 1 August Ernstsson 2017
CAS for Stack• Push
• Keep track of old head
• Set new elements next pointer to old head
• Atomically:
• Compare current head with saved old head
• If still equal, set list head to new element
do { old = head; elem.next = old; } while(CAS(head, old, elem) != old);
TDDD56 Lesson 1 August Ernstsson 2017
CAS push
old_head
B C
A
head
set new elements next pointer to old head
TDDD56 Lesson 1 August Ernstsson 2017
CAS push, success
old_head
B C
A
head
still equal?
==
start atomic operation
TDDD56 Lesson 1 August Ernstsson 2017
CAS push, success
old_head
B C
A
head
end atomic operation
set list head to new element
TDDD56 Lesson 1 August Ernstsson 2017
CAS push, failure
old_head
B C
A
head
still equal?
==
start atomic operation
X
TDDD56 Lesson 1 August Ernstsson 2017
ABA problem• List elements can be re-used
• Memory is limited, pointers can reappear => still low risk
• Improve performance by keeping a pool of unused list elements => much greater risk of re-use!
• What if a list element is
• popped,
• pushed (with new content),
• during the non-atomic part of a Pop?
TDDD56 Lesson 1 August Ernstsson 2017
ABA problem
old_head
B
Chead
A
old_nextpool
thread 1 pops A, thread 2 pops B
TDDD56 Lesson 1 August Ernstsson 2017
ABA problem
old_head
B
Chead A
old_nextpool
thread 0 resumes pop; enters atomic region: compares head and old_head
==
what is the problem here?
TDDD56 Lesson 1 August Ernstsson 2017
ABA problem
old_head
B
Chead
A
old_nextpool
A is popped, setting head to old_next (B)
stack points into pool!
elements have leaked!
TDDD56 Lesson 1 August Ernstsson 2017
Lab 2• Goal for the lab:
• Implement non-blocking unbounded stack
• Use atomic operations
• Study the ABA problem
• Detect it or force it to occur
• Can it be avoided?
TDDD56 Lesson 1 August Ernstsson 2017
Lab 3:High-level parallel programming• New lab for this year, old lab material outdated
• High-level parallel programming abstracts architecture details
• Skeleton programming in SkePU
• Inspired by functional programming
• Multi-backend (including GPUs)
• More information in Lesson 2!
TDDD56 Lesson 1 August Ernstsson 2017
Deadlines
• CPU labs: soft deadline 23 nov 2017
• We will not prioritize CPU labs after this date
• All labs: hard dealine last lab session (13-14 dec 2017)
TDDD56 Lesson 1 August Ernstsson 2017
Final Remarks• Sign up in WebReg now, deadline Friday 3/11
• Labs start next week. Aim for one lab done per week.
• Introductory exercises (”lab 0”) on course webpage on C, Pthreads
• Lab material still being updated:
• Labs 1, 2 same as last year. Lab 3 is all new.
• For questions on GPU labs, contact Ingemar.
TDDD56 Lesson 1 August Ernstsson 2017
FINAL remark
• If you know how to do version control (you should),
•Use it! • Commit often, push at the end of the lab session
• Rollback bad changes, recover from data loss
• Share between group members!
• IDA has a GitLab server with web interface