2008 Intel China Multi-core Academic Forum, Xiamen, China, August 28-29, 2008.
Wuhan University
Agenda
• Introduction
• Achievements
• Course Development
• TBB
• Resources for curriculum building
Introduction
• Multi-core Architecture & Multi-threaded Programming
  – Electronic Information College, Wuhan University
  – 3 credits, 72 hours
  – 6th semester
MOE-Intel Model Curriculum, 2007
| No | Curriculum Name | Principal | University |
|----|-----------------|-----------|------------|
| 1 | Parallel Program Design | Huashan Yu | Peking Univ. |
| 2 | Advanced Computer Architecture | Weimin Zheng | Tsinghua Univ. |
| 3 | Principle of OS | Dan Wang | Beijing Univ. of Industry |
| 4 | Parallel Computing | Jizhou Sun | Tianjin Univ. |
| 5 | Computer Architecture Based on Multi-core Processor | Jinjiao Li | Northeast Univ. |
| 6 | Principle of OS | Huijuan Zhang | Tongji Univ. |
| 7 | Parallel Programming Based on Multi-core Processor | Bin Luo | Nanjing Univ. |
| 8 | Multi-core Computing | Tianzhou Chen | Zhejiang Univ. |
| 9 | High Performance Processor Architecture | Hong An | University of Science and Technology of China |
| 10 | Multi-core Architecture & Multi-threaded Programming | Jianfeng Yang | Wuhan Univ. |
| 11 | Compiler Principle | Huowang Chen | National Univ. of Defense Technology |
| 12 | Software Design Based on Multi-core | Hu Chen | South China Univ. of Technology |
http://www.jpkcnet.com/new/zhengce/Announces_detail.asp?Announces_ID=94
Achievements
Multi-core Training Workshop
• April 10-17, 2008
• Participants
  – 46 faculty members from 22 PRC universities
http://softwareblogs-zho.intel.com/2008/05/06/125/
Intel Cup Embedded Contest
• 1st-Class Award
  – Vehicle Real-time Image Panorama System
• 2nd-Class Award
  – Intelligent Target Tracking and Photo Taking System
• 3rd-Class Award
  – Video-based Safety Guardianship Aid
Course Development
Teaching Groups Combination
• The Embedded System teaching group (TG) and the Multi-core Architecture & Multi-threaded Programming TG are combined into one TG
• Different concepts & design methods
• Similar programming modules & tools
• Intended for the design of embedded systems with multi-/many-core processors
• One lab room
Applications
• Packet processing
• Image processing / searching
• Data clustering with large databases
• …
• Control oriented: real-time systems (may be I/O bound)
• Computing oriented: large-scale computing
• Mixed solution
More parallelism or concurrency will be better
We use…
• Intel Compilers
• Intel VTune Performance Analyzer
• Threading Analysis
  – Intel Thread Checker
  – Intel Thread Profiler
• Performance Library
  – Intel IPP
  – Intel OpenCV
  – Intel MKL
• OpenMP, MPI, Pthreads…
Threading Building Blocks (TBB)
• C++ template library
• Parallelism: higher-level, task-based
• Outfits C++ for multi-core processor parallelism
• Initial release: August 2006; current version: 2.1
• Main focus: performance & scalability
• Positioned as the best solution for many-core processors
http://www.threadingbuildingblocks.org/
TBB
• Most students are familiar with C++
  – Easy to follow
• You specify tasks instead of threads
  – The library maps your logical tasks onto physical threads, using cache efficiently and balancing load
• Avoids the tedious work of creating and managing threads
• Avoids inefficient programs
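The idea of specifying tasks rather than threads can be sketched in standard C++ without TBB itself. The helper `for_each_chunk` below is a hypothetical name invented for this illustration, not a TBB API. It cuts an index range into logical tasks and lets a pool sized to the hardware claim them one at a time, so a slow chunk does not hold up the rest. TBB's real scheduler is more sophisticated; this is only a sketch of the task-to-thread mapping.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper for illustration only (not part of TBB).
// Splits [0, n) into logical tasks of at most `grain` iterations and lets a
// fixed pool of threads claim tasks until none are left.
void for_each_chunk(std::size_t n, std::size_t grain,
                    const std::function<void(std::size_t, std::size_t)>& body) {
    assert(grain > 0);
    const std::size_t num_tasks = (n + grain - 1) / grain;

    // One worker per hardware thread, but never more workers than tasks.
    std::size_t workers = std::thread::hardware_concurrency();
    if (workers == 0) workers = 2;  // the standard allows 0 when unknown
    workers = std::max<std::size_t>(1, std::min(workers, num_tasks));

    std::atomic<std::size_t> next_task{0};
    auto worker = [&] {
        for (;;) {
            const std::size_t t = next_task.fetch_add(1);  // claim the next task
            if (t >= num_tasks) return;
            const std::size_t begin = t * grain;
            body(begin, std::min(n, begin + grain));
        }
    };

    std::vector<std::thread> pool;
    for (std::size_t i = 1; i < workers; ++i) pool.emplace_back(worker);
    worker();  // the calling thread takes part as well
    for (std::thread& th : pool) th.join();
}

// Example use: square every element; the caller never creates a thread.
std::vector<int> square_all(std::vector<int> values) {
    for_each_chunk(values.size(), 1024, [&](std::size_t begin, std::size_t end) {
        for (std::size_t i = begin; i < end; ++i) values[i] *= values[i];
    });
    return values;
}
```

The caller of `square_all` only describes the work per chunk; how many threads run it, and which thread gets which chunk, is decided inside the helper.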
TBB
• Comparison with raw threads
  – Raw threads, e.g. Pthreads, Windows threads
    • Intended for shared-memory parallelism
    • Lowest level of control over parallelism
    • Costly to program, debug, and maintain
    • Difficult to program correctly and efficiently
    • Prone to load imbalance
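To make the cost of the raw-thread style concrete, here is a minimal sketch that sums an array with hand-managed threads. It uses `std::thread` as a portable stand-in for Pthreads or Windows threads, and the function name `sum_with_raw_threads` is made up for this example. Note how much of the code is bookkeeping (partitioning, remainder handling, joining) rather than the computation itself.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Sums `data` with hand-managed threads and a static partition.
// std::thread stands in here for Pthreads / Windows threads.
long long sum_with_raw_threads(const std::vector<int>& data, std::size_t num_threads) {
    assert(num_threads > 0);
    // One result slot per thread, so no locking is needed. Adjacent slots may
    // share a cache line, which is why each thread accumulates locally first.
    std::vector<long long> partial(num_threads, 0);
    std::vector<std::thread> threads;
    threads.reserve(num_threads);

    const std::size_t chunk = data.size() / num_threads;
    for (std::size_t t = 0; t < num_threads; ++t) {
        const std::size_t begin = t * chunk;
        // The last thread also takes the remainder; forgetting this drops elements.
        const std::size_t end = (t + 1 == num_threads) ? data.size() : begin + chunk;
        threads.emplace_back([&data, &partial, t, begin, end] {
            long long s = 0;
            for (std::size_t i = begin; i < end; ++i) s += data[i];
            partial[t] = s;
        });
    }
    // Every thread must be joined; destroying a joinable std::thread
    // terminates the program.
    for (std::thread& th : threads) th.join();

    long long total = 0;
    for (long long p : partial) total += p;
    return total;
}
```

The static partition is also where load imbalance comes from: if one chunk is more expensive than the others, the remaining threads sit idle until it finishes.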
TBB
• Comparison with MPI
  – MPI is intended for distributed-memory systems
  – Communication among processes for synchronization is costly
TBB
• Comparison with OpenMP
  – Intended for shared memory; first standard: 1997; supported by most compilers
  – You need to choose one of the scheduling approaches for loop iterations:
    • Static, Dynamic, Guided
  – TBB:
    • Divide-and-conquer scheduling approach; you don't need to worry about scheduling policies
  – Reduction:
    • OpenMP: built-in types
    • TBB: all types
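For reference, the OpenMP side of this comparison might look like the sketch below: a dot product with an explicit `schedule` clause and a `reduction` on a built-in type. The function name `dot` is chosen for this example. If the compiler is not run with its OpenMP option (for example `-fopenmp` with GCC), the pragma is ignored and the loop simply runs serially with the same result. The restriction to built-in types reflects OpenMP at the time of this talk; OpenMP 4.0 later added user-defined reductions.

```cpp
#include <cassert>
#include <vector>

// Dot product with an OpenMP reduction on a built-in type (double).
double dot(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    const long n = static_cast<long>(a.size());
    // The scheduling policy is chosen by the programmer: static here,
    // with dynamic or guided as the alternatives.
    #pragma omp parallel for schedule(static) reduction(+ : sum)
    for (long i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

Swapping `schedule(static)` for `schedule(dynamic)` or `schedule(guided)` changes how iterations are handed out, which is the choice TBB's recursive range splitting takes off the programmer's hands.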
TBB
• Besides:
  – Scalability
    • Tuning grain size
    • Does well on single-core platforms
    • Especially suited to the move to many-core processors
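The role of the grain size can be illustrated with a small divide-and-conquer sum in standard C++. This is a sketch under stated assumptions, not TBB code: `sum_range` is a hypothetical helper, and `std::async` starts a new thread per split, whereas TBB reuses a pool of worker threads. The structure is the point: a range is split in half until it is no larger than the grain, and only then processed serially.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical helper for illustration (not a TBB API).
// Recursively halves [begin, end) until a piece holds at most `grain`
// elements, then sums that piece serially.
long long sum_range(const std::vector<int>& data, std::size_t begin,
                    std::size_t end, std::size_t grain) {
    assert(grain > 0 && begin <= end && end <= data.size());
    if (end - begin <= grain) {
        long long s = 0;
        for (std::size_t i = begin; i < end; ++i) s += data[i];
        return s;
    }
    const std::size_t mid = begin + (end - begin) / 2;
    // Left half runs concurrently; the right half runs on the current thread.
    std::future<long long> left = std::async(std::launch::async, [&data, begin, mid, grain] {
        return sum_range(data, begin, mid, grain);
    });
    const long long right = sum_range(data, mid, end, grain);
    return left.get() + right;
}
```

With a grain as large as the whole range the function runs serially on one core; a smaller grain exposes more parallelism, and a grain that is too small spends more time on splitting than on computing. In this sketch a tiny grain is especially expensive because every split creates a thread.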
Key contents of TBB
• Synchronization primitives
  – atomic operations
  – various flavors of mutexes (improved)
• Parallel algorithms
  – parallel_for (improved)
  – parallel_reduce (improved)
  – parallel_do (new)
  – pipeline (improved)
  – parallel_sort
  – parallel_scan
• Concurrent containers (all improved)
  – concurrent_hash_map
  – concurrent_queue
  – concurrent_vector
• Task scheduler
  – With new functionality
• Memory allocators
  – tbb_allocator (new), cache_aligned_allocator, scalable_allocator
• Utilities
  – tick_count
  – tbb_thread (new)
TBB: Task-Based Approach
• Intel® TBB provides C++ constructs that allow you to express parallel solutions in terms of task objects
  – Task scheduler manages the thread pool
  – Task scheduler avoids common performance problems of programming with threads

| Problem | Intel® TBB Approach |
|---------|---------------------|
| Oversubscription | One scheduler thread per hardware thread |
| Fair scheduling | Non-preemptive unfair scheduling |
| High overhead | Programmer specifies tasks, not threads |
| Load imbalance | Work-stealing balances load |
Teaching TBB contents
• Delivered after the OpenMP and MPI contents
  – Students have already grasped the basic thread and threading concepts
• Analyze programs built with TBB using Intel programming tools
• Lab projects:
  – Matrix multiply
  – Numerical integration
  – Recursive tasks
  – Concurrent hash map
  – Scalable allocator
• Applications
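As an indication of what the numerical integration lab involves, the sketch below estimates pi by integrating 4 / (1 + x²) over [0, 1] with the midpoint rule. The slides do not say which integrand or method the lab uses, so both are assumptions, as is the function name `integrate_pi`. The sketch uses standard C++ `std::async` rather than TBB; in the lab itself the partial sums would be the natural job of `parallel_reduce`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <future>
#include <vector>

// Midpoint-rule estimate of the integral of 4 / (1 + x^2) over [0, 1],
// which equals pi. The strips are divided among `num_tasks` concurrent tasks.
double integrate_pi(std::size_t steps, std::size_t num_tasks) {
    assert(steps > 0 && num_tasks > 0);
    const double width = 1.0 / static_cast<double>(steps);

    std::vector<std::future<double>> parts;
    parts.reserve(num_tasks);
    for (std::size_t t = 0; t < num_tasks; ++t) {
        // Task t handles strips [begin, end); the ranges tile [0, steps) exactly.
        const std::size_t begin = steps * t / num_tasks;
        const std::size_t end = steps * (t + 1) / num_tasks;
        parts.push_back(std::async(std::launch::async, [=] {
            double s = 0.0;
            for (std::size_t i = begin; i < end; ++i) {
                const double x = (static_cast<double>(i) + 0.5) * width;
                s += 4.0 / (1.0 + x * x);
            }
            return s;
        }));
    }

    // Reduction step: combine the partial sums.
    double total = 0.0;
    for (std::future<double>& p : parts) total += p.get();
    return total * width;
}
```

Each task keeps a private partial sum and the results are combined once at the end, which avoids any shared accumulator and the locking it would need.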
Teaching TBB contents
• Note: TBB is NOT intended for:
  – I/O-bound processing
  – Real-time processing
• So you need to tune your program carefully with Intel programming tools according to your application, such as embedded system design.
• TBB will surely win you over sooner or later, for your teaching or your applications!
Resources for curriculum building
• TBB as an example
• What can we get from Intel?
  – Software
  – Technical support
  – Courseware
  – Teaching approaches & experience
  – New/updated technology
  – Training opportunities
Resource
Software
• Mainly, ISC (Intel Software College)
  – http://softwarecollege.intel.com
  – http://www.intel.com/cd/software/products/asmo-na/eng/index.htm
  – Two columns
    • Intel® Academic Community
    • Training for Developers
• For TBB
  – http://www.threadingbuildingblocks.org/
Software
• Register to access more
Courseware
• University Program
  – http://softwarecollege.intel.com/university/
Courseware
• Access Courseware
  – Multi-core courseware content from Intel, including lab projects, source code, etc.
  – Multi-core courseware from faculty
  – Other courseware
  – Webinars
  – Videos
  – ISC 'Wiki'
  – Featured events
  – Papers
  – Conference presentations
Courseware
• Curriculum websites from universities worldwide
• Websites of Microsoft and other companies
• Directly from Intel engineering
Support
• Documentation
  – White papers
  – Reference manuals
  – Books
• Training opportunities
  – Workshops
  – Webinars
  – Forums
Others
• Intel Blog
  – Global (in English) and Chinese blogs
    • New/updated technology
    • Teaching approaches & experience
• Webinars
  – An excellent training approach; you can talk with veteran Intel engineers directly
  – Recent topics: (see next slide)
Others
• Towards a Curriculum for Parallelism - Design Patterns
• Towards a Curriculum for Parallelism - Using Multithreaded Libraries to Maximize Performance for Digital Media Apps
• Towards a Curriculum for Parallelism - Practical hands on architecture for applications programmers
• Teaching Many-core Computing in an Academic Environment - Design Doc Review
• Towards a Curriculum for Parallelism - Using Intel® Threading Building Blocks
• The March towards Manycores - Towards a Curriculum for Parallelism
Other Useful Programming Tools
• Parallel Studio
  – Announced August 20, 2008
  – Ultimate all-in-one parallelism toolkit
  – Intel Parallel Advisor
    • Gain insight into where parallelism will benefit existing source code.
  – Intel Parallel Composer
    • Incorporate parallelism quickly with a C/C++ compiler and comprehensive threaded libraries.
  – Intel Parallel Inspector
    • Ensure application reliability with a proactive "bug finder" for all parallel programming models.
  – Intel Parallel Amplifier
    • Easy-to-use performance analyzer finds bottlenecks quickly.