Hyper Threading technology

Hyper-Threading Technology

Presented By

KARUNAKAR THAKUR

To Enhance Performance-

Increase in clock rate

o Involves reducing clock cycle time

o Can increase the performance by increasing number of instructions finishing per second

o H/w limitations limit this feature

Cache hierarchies

o Having frequently used data on the processor caches reduces average accesses time

Introduction

Pipelining

o Implementation Technique whereby multiple instructions are overlapped in execution

o Limited by the dependencies between instructions

o Effected by stalls and effective CPI is greater than 1

Instruction Level Parallelism o It refers to techniques to increase the number of instructions executed in each clock cycle.

o Exists whenever the machine instructions that make up a program are insensitive to the order in which they are executed if dependencies does not exist, they may be executed.

Thread level parallelism

Chip Multi Processing

o Two processors, each with full set of execution and architectural resources, reside on a single die.

Time Slice Multi Threading

o single processor to execute multiple threads by switching between them

Switch on Event Multi Threading

o switch threads on long latency events such as cache misses

Simultaneous Multi Threading

o Multiple threads can execute on a single processor without switching.

oThe threads execute simultaneously and make much better use of the resources.

oIt maximizes the performance vs. transistor count and power consumption.

Thread level parallelism (cont..)

Hyper-Threading Technology

Hyper-Threading Technology brings the simultaneous multi-threading approach to the Intel architecture.

Hyper-Threading Technology makes a single physical processor appear as two or more logical processors

Hyper-Threading Technology first invented by Intel Corp.

Hyper-Threading Technology provides thread-level-parallelism (TLP) on each processor resulting in increased utilization of processor and execution resources.

Each logical processor maintain one copy of the architecture state

Processor Execution Resources

Processor Execution Resources

Arch State Arch State Arch State

Processor with out Hyper-Threading Technology

Processor with Hyper-Threading Technology

Ref: Intel Technology Journal, Volume 06 Issue 01, February 14, 2002

Hyper-Threading Technology Architecture

Register Alias Tables

Next-Instruction Pointer

Instruction Streaming Buffers and Trace Cache Fill Buffers

Instruction Translation Look-aside Buffer

Following resources are duplicated to support Hyper-Threading Technology

Figure: Intel Xeon processor pipeline

Sharing of Resources

Major Sharing Schemes are-

o Partition

o Threshold

o Full Sharing

Partition

Each logical processor uses half the resources

Simple and low in complexity

Ensures fairness and progress

Good for major pipeline queues

Partitioned Queue Example

• Yellow thread – It is faster thread

• Green thread – It is slower thread

Partitioned Queue Example

• Partitioning resource ensures fairness and

ensures progress for both logical processors.

Threshold

Puts a threshold on number of resource entries a logical processor can use.

Limits maximum resource usage

For small structures where resource utilization in burst and time of utilization is short, uniform and predictable

Eg- Processor Scheduler

Full Sharing

Most flexible mechanism for resource sharing, do not limit the maximum uses for resource usage for a logical processor

Good for large structures in which working set sizes are variable and there is no fear of starvation

Eg: All Processor caches are sharedo Some applications benefit from a shared cache

because they share code and data, minimizing redundant data in the caches

Netburst Microarchitecture’s execution pipeline

• Two modes of operations – single-task (ST) – multi-task (MT).

• MT-mode- There are two active logical processors and some of the resources are partitioned.

• There are two flavors of ST-mode: single-task logical processor 0 (ST0) and single-task logical processor 1 (ST1).

• In ST0- or ST1-mode, only one logical processor is active, and resources that were partitioned in MT-mode are re-combined to give the single active logical processor use of all of the resources

SINGLE-TASK AND MULTI-TASK MODES

SINGLE-TASK AND MULTI-TASKMODES

• HALT instruction that stops processor execution.

• On a processor with Hyper-Threading Technology, executing HALT transition the processor from MT-mode to ST0- or ST1-mode, depending on which logical processor executed the HALT.

• In ST0- or ST1-modes, an interrupt sent to the halted logical processor would cause a transition to MT-mode.

OPERATING SYSTEM

• For best performance, the operating system should implement two optimizations.

– The first is to use the HALT instruction if one logical processor is active and the other is not. HALT will allow the processor to transition MT mode to either the ST0- or ST1-mode.

– The second optimization is in scheduling software threads to logical processors. The operating system should schedule threads to logical processors on different physical processors before scheduling two threads to the same physical processor.

Business Benefits of Hyper-ThreadingTechnology

• Higher transaction rates for e-Businesses

• Improved reaction and response times for end-users and customers.

• Increased number of users that a server system can support

• Handle increased server workloads

• Compatibility with existing server applications and operating systems

Performance increases from Hyper-Threading Technology on an OLTP workload

Web server benchmark performance

Conclusion

•Intel’s Hyper-Threading Technology brings the concept of simultaneous multi-threading to the Intel Architecture.

•It will become increasingly important going forward as it adds a new technique for obtaining additional performance for lower transistor and power costs.

•The goal was to implement the technology at minimum cost while ensuring forward progress on logical processors, even if the other is stalled, and to deliver full performance even when there is only one active logical processor.

References • “HYPER-THREADING TECHNOLOGY

ARCHITECTURE AND MICROARCHITECTURE” by Deborah T. Marr, Frank Binns, David L. Hill, Glenn Hinton,David A. Koufaty, J. Alan Miller, Michael Upton, intel Technology Journal, Volume 06 Issue 01, Published February 14, 2002. Pages: 4 –15.

• “:HYPERTHREADING TECHNOLOGY IN THE NETBURST MICROARCHITECTURE” by David Koufaty,Deborah T. Marr, IEEE Micro, Vol. 23, Issue 2, March–April 2003. Pages: 56 – 65.

• http://cache-www.intel.com/cd/00/00/22/09/220943_220943.pdf

• http://www.cs.washington.edu/research/smt/papers/tlp2ilp.final.pdf

• http://mos.stanford.edu/papers/mj_thesis.pdf

Thank youThank you