KIPA Game Engine Seminars, Day 15. Jonathan Blow. Seoul, Korea. December 12, 2002.

Page 1: KIPA Game Engine Seminars

1

KIPA Game Engine Seminars

Jonathan Blow

Seoul, Korea

December 12, 2002

Day 15

Page 2: KIPA Game Engine Seminars

2

Bit Tricks

• Generating Bit Masks

• Is some number a power of two?

• Avoiding ‘if’ statements (branch prediction)

• Floating-point absolute value

• Floating-point compare

• Floating-point log2

Page 3: KIPA Game Engine Seminars

3

Generating Bit Masks

• Suppose we want to mask the low n bits of a machine word

• We can generate that with a loop

• Show summation equation for the loop

• Identity that lets us do something faster
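The loop and the faster identity can be sketched as follows. The loop computes the sum of 2^i for i from 0 to n-1, and that sum equals 2^n - 1, so a shift and a subtract give the same mask. The function names are made up for illustration and a 32-bit word is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Loop version: OR in one bit at a time, i.e. the sum of 2^i for i in [0, n). */
uint32_t low_mask_loop(unsigned n) {
    assert(n <= 32);
    uint32_t mask = 0;
    for (unsigned i = 0; i < n; i++) {
        mask |= (uint32_t)1 << i;
    }
    return mask;
}

/* Closed form: 2^n - 1. Shifting a 32-bit value by 32 is undefined in C,
   so the full-word case is handled separately. */
uint32_t low_mask_fast(unsigned n) {
    assert(n <= 32);
    if (n == 32) {
        return UINT32_MAX;
    }
    return ((uint32_t)1 << n) - 1;
}
```

For example, `low_mask_fast(5)` yields `0x1F`, the same value the loop builds in five iterations.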

Page 4: KIPA Game Engine Seminars

4

Is some number a power of two?

• The power-of-two will be a single bit somewhere in the middle of the word

• The power-of-two minus one will be a bit mask like the ones we just looked at

• ANDing them together will produce 0
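A minimal sketch of that test, with a hypothetical function name. Zero needs an explicit check, because 0 AND (0 - 1) is also 0 even though 0 is not a power of two.

```c
#include <stdbool.h>
#include <stdint.h>

/* x - 1 clears the lowest set bit of x and sets every bit below it.
   If x has exactly one set bit, x and x - 1 share no bits. */
bool is_power_of_two(uint32_t x) {
    return x != 0 && (x & (x - 1)) == 0;
}
```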

Page 5: KIPA Game Engine Seminars

5

Counting the number of set bits in a machine word

• Slow loop version

• “Trick” O(num set bits) version

• Discussion of tree version
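The three versions might look like this for a 32-bit word. The function names are illustrative, and the talk's own code may differ in detail.

```c
#include <stdint.h>

/* Slow loop: always examines all 32 bit positions. */
unsigned popcount_loop(uint32_t x) {
    unsigned count = 0;
    for (int i = 0; i < 32; i++) {
        count += (x >> i) & 1u;
    }
    return count;
}

/* "Trick" version: x & (x - 1) clears the lowest set bit,
   so the loop runs once per set bit. */
unsigned popcount_sparse(uint32_t x) {
    unsigned count = 0;
    while (x != 0) {
        x &= x - 1;
        count++;
    }
    return count;
}

/* Tree version: add neighboring 1-bit fields into 2-bit sums,
   then 2-bit into 4-bit, and so on up to the full word. */
unsigned popcount_tree(uint32_t x) {
    x = (x & 0x55555555u) + ((x >> 1) & 0x55555555u);
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);
    x = (x & 0x0F0F0F0Fu) + ((x >> 4) & 0x0F0F0F0Fu);
    x = (x & 0x00FF00FFu) + ((x >> 8) & 0x00FF00FFu);
    x = (x & 0x0000FFFFu) + (x >> 16);
    return x;
}
```

The tree version takes a fixed five steps regardless of how many bits are set, and every step is a right shift.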

Page 6: KIPA Game Engine Seminars

6

Pentium 4 “fireball”

• A 16-bit integer unit at the core of the chip that runs at very high clock speeds

• 32-bit integer operations are pipelined through the fireball as multi-stage 16-bit operations

• Pipeline is organized for bits to flow from bottom to top of the word (as with addition and subtraction)

• Right-shifts require a dependency that goes in the opposite direction (slower!)

Page 7: KIPA Game Engine Seminars

7

“How many bits does it take to store this range of values?”

• Application: network or file i/o

• Want ceil(log2(n_max + 1)) bits, assuming the values go from 0 to n_max

• Slow floating-point versions

• Fast bit-extraction versions
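Two possible integer-side versions are sketched below, with hypothetical names. Since n_max itself must be representable, the count is the position of its highest set bit plus one, which equals ceil(log2(n_max + 1)). The second function reads the exponent field of a double and assumes IEEE-754 binary64; it is one guess at what "bit extraction" means here, not necessarily the version shown in the talk.

```c
#include <stdint.h>
#include <string.h>

/* Shift until nothing is left: counts the position of the highest set bit. */
unsigned bits_needed(uint32_t n_max) {
    unsigned bits = 0;
    while (n_max != 0) {
        bits++;
        n_max >>= 1;
    }
    return bits;
}

/* Convert to double (exact for any 32-bit integer) and read the exponent.
   Assumes IEEE-754 binary64: 11 exponent bits above 52 mantissa bits, bias 1023. */
unsigned bits_needed_via_double(uint32_t n_max) {
    if (n_max == 0) {
        return 0;
    }
    double d = (double)n_max;
    uint64_t raw;
    memcpy(&raw, &d, sizeof raw);
    unsigned exponent = (unsigned)((raw >> 52) & 0x7FF) - 1023;
    return exponent + 1;
}
```

A range of 0 to 4 holds five distinct values, so both functions return 3.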

Page 8: KIPA Game Engine Seminars

8

Floating-Point log2

• Show slow version

• Fast version utilizing the IEEE-754 format
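A sketch of the fast version, assuming a 32-bit IEEE-754 float and a positive, normalized, finite input. The slow version would be a call such as `floorf(log2f(x))`. The function name is illustrative.

```c
#include <stdint.h>
#include <string.h>

/* A normalized float is 1.mantissa * 2^(exponent - 127), so the biased
   exponent field (bits 23..30) minus 127 is floor(log2(x)).
   Not valid for zero, denormals, negatives, infinities, or NaN. */
int float_log2_floor(float x) {
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);   /* well-defined way to read the raw bits */
    return (int)((bits >> 23) & 0xFF) - 127;
}
```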

Page 9: KIPA Game Engine Seminars

9

Fast absolute value

• Utilizing IEEE-754 floating point format
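In IEEE-754 the sign is the top bit, so clearing it gives the absolute value. A minimal sketch, assuming a 32-bit float and using a hypothetical function name:

```c
#include <stdint.h>
#include <string.h>

/* Clear bit 31 (the sign bit); exponent and mantissa are left untouched. */
float fast_fabs(float x) {
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    bits &= 0x7FFFFFFFu;
    memcpy(&x, &bits, sizeof x);
    return x;
}
```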

Page 10: KIPA Game Engine Seminars

10

Fast floating-point compare

• Description of how x86 machines compare floating-point numbers

– Get at least one of them onto the FPU register stack

– Perform the ‘fcomp’ instruction

– Store the floating-point status word (the comparison result is in its condition-code bits)

– Bit-mask it to see if the desired field is set
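The slide lists only the slow x87 path. One common way around it, shown here as an assumption about the fast version rather than the talk's confirmed code, is to compare the raw IEEE-754 bit patterns as integers. Non-negative floats already sort like unsigned integers; negative floats sort in reverse, so their bits are inverted first. The names are illustrative.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Map a float to an unsigned key whose integer order matches float order.
   Negative values: invert every bit. Non-negative values: set the top bit
   so they land above all negatives. NaN is not handled. */
static uint32_t float_to_ordered_key(float x) {
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return (bits & 0x80000000u) ? ~bits : (bits | 0x80000000u);
}

/* Integer-only "a < b". Caveat: -0.0f orders below +0.0f here,
   whereas a real floating-point compare treats them as equal. */
bool float_less(float a, float b) {
    return float_to_ordered_key(a) < float_to_ordered_key(b);
}
```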

Page 11: KIPA Game Engine Seminars

11

Decision-making without branching

• (And without writing in assembly language, to use instructions like CMOV)

• Build a mask based on whether some intermediate result is negative or not

• Use that to mask values and add them, or whatever you want

– Examples
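Two small examples of the mask idea, written in plain C with hypothetical names. They assume 32-bit two's-complement integers, and `branchless_min` assumes that `a - b` does not overflow.

```c
#include <stdint.h>

/* All ones if x is negative, all zeros otherwise.
   The sign bit is moved to bit 0, then 0 - 1 wraps to 0xFFFFFFFF. */
static uint32_t negative_mask(int32_t x) {
    return 0u - ((uint32_t)x >> 31);
}

/* Pick a where the mask is all ones, b where it is all zeros. */
static int32_t select_by_mask(uint32_t mask, int32_t a, int32_t b) {
    return (int32_t)(((uint32_t)a & mask) | ((uint32_t)b & ~mask));
}

/* a - b is negative exactly when a < b (given no overflow). */
int32_t branchless_min(int32_t a, int32_t b) {
    return select_by_mask(negative_mask(a - b), a, b);
}

/* Negative inputs are masked away to 0; others pass through unchanged. */
int32_t clamp_to_zero(int32_t x) {
    return (int32_t)((uint32_t)x & ~negative_mask(x));
}
```

Neither function contains an `if`, so there is nothing for the branch predictor to get wrong.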

Page 12: KIPA Game Engine Seminars

12

Collision Detection

• Speedbox and Schnitzel as alternatives to the “prevent tunneling” raycast

Page 13: KIPA Game Engine Seminars

13

Collision Detection

• Don’t forget to optimize mainly for the expected case!

– To miss a lot, or to hit a lot?

• Example of Shock Force and the “early hit test”

– We expect to miss usually!

– So the early hit test was not so effective

Page 14: KIPA Game Engine Seminars

14

Collision Detection

• More Shock Force examples

– Hierarchy of tests: bounding sphere, OBB, simple plane divide, BSP “hard case”

Page 15: KIPA Game Engine Seminars

15

Profiling

• Motivation

– You can’t optimize unless you profile. For some reason some people think they can… they’re wrong.

• Demo of sample app

• Goals:

– Know where the overall CPU time is being spent

• May depend on which kind of behavior is happening!

– Know which routines are stable and which ones are not

Page 16: KIPA Game Engine Seminars

16

Profiling

• Example of getting the current time on Windows

– At different accuracy levels

• Description of how this is slow, and why

– Too slow to call very often in code!

Page 17: KIPA Game Engine Seminars

17

Profiling (2)

• Using the rdtsc instruction

• Converting this to realtime units by calling QueryPerformanceCounter once per frame
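The conversion arithmetic can be sketched without touching the hardware. The caller is assumed to have taken an rdtsc reading and a QueryPerformanceCounter reading at the start and end of a frame, and to have the counts-per-second value from QueryPerformanceFrequency. The function and parameter names are hypothetical.

```c
#include <stdint.h>

/* Estimate TSC ticks per second from one frame's worth of readings.
   tsc_delta: difference between two rdtsc readings across the frame.
   qpc_delta: difference between two QueryPerformanceCounter readings
              taken at the same two points.
   qpc_freq:  counts per second, from QueryPerformanceFrequency. */
double estimate_tsc_hz(uint64_t tsc_delta, uint64_t qpc_delta, uint64_t qpc_freq) {
    if (qpc_delta == 0 || qpc_freq == 0) {
        return 0.0;   /* frame too short to measure, or no counter available */
    }
    double frame_seconds = (double)qpc_delta / (double)qpc_freq;
    return (double)tsc_delta / frame_seconds;
}

/* Convert a tick count to seconds using the current rate estimate. */
double tsc_to_seconds(uint64_t tsc_ticks, double tsc_hz) {
    return tsc_hz > 0.0 ? (double)tsc_ticks / tsc_hz : 0.0;
}
```

The slow timer is called only once per frame, while the cheap rdtsc reads happen as often as the profiler needs.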

Page 18: KIPA Game Engine Seminars

18

Profiling (3)

• Define macros that put rdtsc calls into preambles and postambles for functions

• Measure and categorize CPU time this way

• Measure “self time” and “hierarchical time”

• Code review of macros / constructors
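A rough C sketch of the bookkeeping behind such macros. The slide mentions constructors, which suggests the original uses C++ scope objects; this version uses explicit begin and end macros instead. `g_fake_ticks` is a stand-in for rdtsc so the sketch stays portable, and every name here is hypothetical.

```c
#include <stdint.h>

#define MAX_ZONES 16
#define MAX_DEPTH 32

typedef struct {
    uint64_t self_ticks;          /* time in the zone, excluding child zones */
    uint64_t hierarchical_ticks;  /* time in the zone, including child zones */
} ZoneStats;

typedef struct {
    int zone;
    uint64_t start;
    uint64_t child_ticks;         /* ticks spent in zones entered from this one */
} ZoneFrame;

static ZoneStats g_zones[MAX_ZONES];
static ZoneFrame g_stack[MAX_DEPTH];
static int g_depth = 0;

/* Stand-in tick source; a real profiler would read rdtsc here. */
static uint64_t g_fake_ticks = 0;
static uint64_t read_ticks(void) { return g_fake_ticks; }

static void zone_enter(int zone) {
    ZoneFrame *frame = &g_stack[g_depth++];
    frame->zone = zone;
    frame->start = read_ticks();
    frame->child_ticks = 0;
}

static void zone_exit(void) {
    ZoneFrame *frame = &g_stack[--g_depth];
    uint64_t total = read_ticks() - frame->start;
    g_zones[frame->zone].hierarchical_ticks += total;
    g_zones[frame->zone].self_ticks += total - frame->child_ticks;
    if (g_depth > 0) {
        /* Charge this zone's whole duration to the parent as child time. */
        g_stack[g_depth - 1].child_ticks += total;
    }
}

#define PROFILE_BEGIN(zone) zone_enter(zone)
#define PROFILE_END()       zone_exit()
```

A function would call `PROFILE_BEGIN(ZONE_ID)` on entry and `PROFILE_END()` before each return. Recursive zones would have their hierarchical time counted more than once in this sketch.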

Page 19: KIPA Game Engine Seminars

19

Problem with rdtsc

• There’s this SpeedStep thing on Intel laptops

– Changes the CPU’s clock speed based on performance / temperature demands

– Does not adjust rdtsc to compensate

• May spread beyond laptops in the future

– Power consumption of CPUs is becoming an important concern for businesses

Page 20: KIPA Game Engine Seminars

20

We can detect if rdtsc is screwing up profiling data

• But we can’t fix the profiling data

• Solution: just draw a big warning on the screen
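The slides do not say how the detection works. One plausible check, offered purely as an assumption, is to compare each frame's estimated rdtsc rate (ticks per second, measured against a wall-clock timer) with a baseline taken at startup, and raise the warning when they disagree by more than some tolerance. The name and the tolerance parameter are hypothetical.

```c
#include <stdbool.h>

/* True when this frame's estimated TSC rate differs from the baseline
   by more than `tolerance`, expressed as a fraction of the baseline
   (0.05 means 5 percent). */
bool tsc_rate_changed(double baseline_hz, double frame_hz, double tolerance) {
    double diff = frame_hz - baseline_hz;
    if (diff < 0.0) {
        diff = -diff;
    }
    return diff > tolerance * baseline_hz;
}
```

When this returns true, the profiler would draw the on-screen warning rather than try to repair samples that were already collected.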

Page 21: KIPA Game Engine Seminars

21

Division of Profiler

• Low-Level Profiler

• High-Level Profiler

Page 22: KIPA Game Engine Seminars

22

Walkthrough of first demo app

• How it uses the macros

• How it collects and draws the profiling data

Page 23: KIPA Game Engine Seminars

23

Measuring variance of profiling data

• To figure out how stable each function is

• Draw which functions are “hot” in the realtime display
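One way to track per-function stability without storing every sample is a running mean and variance. The sketch below uses Welford's update, which is a swapped-in standard technique and not necessarily what the demo app does. The struct and function names are hypothetical.

```c
/* Running statistics for one profiled function's per-frame time. */
typedef struct {
    unsigned long count;
    double mean;
    double m2;      /* sum of squared deviations from the running mean */
} RunningStats;

/* Welford's update: numerically stable, one pass, constant memory. */
void stats_add(RunningStats *s, double sample) {
    s->count++;
    double delta = sample - s->mean;
    s->mean += delta / (double)s->count;
    s->m2 += delta * (sample - s->mean);
}

/* Sample variance; zero until at least two samples have arrived. */
double stats_variance(const RunningStats *s) {
    return s->count > 1 ? s->m2 / (double)(s->count - 1) : 0.0;
}
```

A function whose variance is large relative to its mean is an unstable one and worth highlighting in the display.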

Page 24: KIPA Game Engine Seminars

24

Behaviors

• We would like some better analysis of what the different behaviors are for our program

• Just “eyeing” the results is not very scientific

• Examples of different behaviors

– Fill rate limited, AI limited, etc

Page 25: KIPA Game Engine Seminars

25

Batch Profiling vs Interactive Profiling

• Batch profiling averages a bunch of data together over a session

– Maybe it provides a way to peek at individual samples but the processing is never very convenient

• Interactive profiling is about seeing results as soon as they happen

– But interactive profilers are usually hacked together

• What if we made a good one?

Page 26: KIPA Game Engine Seminars

26

Want to detect and analyze specific behaviors

• But without preconceived ideas of what they might be

• Treat incoming frames of profiling data as vectors, and cluster them

• Description of k-means clustering
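A minimal sketch of one batch k-means (Lloyd) iteration over frames stored as flat arrays. Each frame is a vector with one component per measured quantity; `DIM` is fixed at 2 only to keep the example small. The names and layout are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define DIM 2     /* components per frame vector */
#define MAX_K 8   /* upper bound on cluster count for this sketch */

static double dist_sq(const double *a, const double *b) {
    double sum = 0.0;
    for (int d = 0; d < DIM; d++) {
        double diff = a[d] - b[d];
        sum += diff * diff;
    }
    return sum;
}

/* Index of the centroid closest to v. Centroids are stored flat:
   centroid c occupies centroids[c * DIM .. c * DIM + DIM - 1]. */
size_t nearest_centroid(const double *centroids, size_t k, const double *v) {
    size_t best = 0;
    double best_dist = dist_sq(&centroids[0], v);
    for (size_t c = 1; c < k; c++) {
        double dist = dist_sq(&centroids[c * DIM], v);
        if (dist < best_dist) {
            best_dist = dist;
            best = c;
        }
    }
    return best;
}

/* One iteration: assign every frame to its nearest centroid, then move
   each centroid to the mean of the frames assigned to it. */
void kmeans_step(double *centroids, size_t k, const double *frames, size_t n) {
    assert(k >= 1 && k <= MAX_K);
    double sums[MAX_K * DIM] = {0};
    size_t counts[MAX_K] = {0};
    for (size_t i = 0; i < n; i++) {
        size_t c = nearest_centroid(centroids, k, &frames[i * DIM]);
        counts[c]++;
        for (int d = 0; d < DIM; d++) {
            sums[c * DIM + d] += frames[i * DIM + d];
        }
    }
    for (size_t c = 0; c < k; c++) {
        if (counts[c] == 0) continue;   /* leave empty clusters where they are */
        for (int d = 0; d < DIM; d++) {
            centroids[c * DIM + d] = sums[c * DIM + d] / (double)counts[c];
        }
    }
}
```

`kmeans_step` is repeated until assignments stop changing, and each repetition walks the entire frame set again.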

Page 27: KIPA Game Engine Seminars

27

Clustering algorithms tend to be pretty slow

• And they require batch data to process

– k-means needs random access to the input!

• Online k-means

– Faster, non-batch. But quality?
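A sketch of the online variant: each incoming frame nudges only its nearest centroid, so no frame has to be stored or revisited. The 1/count learning rate turns each centroid into the running mean of the samples it has won; that choice, the struct layout, and the names are all assumptions for illustration.

```c
#include <stddef.h>

#define OK_DIM 2   /* components per frame vector */
#define OK_K 2     /* number of clusters */

typedef struct {
    double centroids[OK_K][OK_DIM];
    unsigned long counts[OK_K];   /* samples absorbed so far, seed included */
} OnlineKMeans;

/* Feed one frame. Returns the index of the cluster that absorbed it. */
size_t online_kmeans_update(OnlineKMeans *m, const double *v) {
    size_t best = 0;
    double best_dist = -1.0;
    for (size_t c = 0; c < OK_K; c++) {
        double dist = 0.0;
        for (int d = 0; d < OK_DIM; d++) {
            double diff = m->centroids[c][d] - v[d];
            dist += diff * diff;
        }
        if (best_dist < 0.0 || dist < best_dist) {
            best_dist = dist;
            best = c;
        }
    }
    /* Move the winner toward the sample by 1/count of the gap. */
    m->counts[best]++;
    double rate = 1.0 / (double)m->counts[best];
    for (int d = 0; d < OK_DIM; d++) {
        m->centroids[best][d] += rate * (v[d] - m->centroids[best][d]);
    }
    return best;
}
```

Seeding each count at 1 treats the initial centroid as one sample. A fixed rate instead of 1/count would let clusters follow behavior that drifts over a session, at the cost of noisier centroids.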

Page 28: KIPA Game Engine Seminars

28

Self-Organizing Map

• “Kohonen Self-Organizing Map”

• Description of the algorithm

• Much like online k-means

– But with coherence in a separate space
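A small sketch of one SOM training step on a 1-D map. The winner is found in input space, exactly as in online k-means, but the update also pulls the winner's neighbors in map space (adjacent node indices). A linear falloff stands in for the Gaussian neighborhood that Kohonen maps usually use; that simplification, the sizes, and the names are assumptions.

```c
#include <stddef.h>

#define SOM_NODES 5   /* nodes laid out along a 1-D map */
#define SOM_DIM 2     /* components per input vector */

typedef struct {
    double weights[SOM_NODES][SOM_DIM];
} Som;

/* Best matching unit: the node whose weights are closest to v in input space. */
size_t som_best_match(const Som *som, const double *v) {
    size_t best = 0;
    double best_dist = -1.0;
    for (size_t i = 0; i < SOM_NODES; i++) {
        double dist = 0.0;
        for (int d = 0; d < SOM_DIM; d++) {
            double diff = som->weights[i][d] - v[d];
            dist += diff * diff;
        }
        if (best_dist < 0.0 || dist < best_dist) {
            best_dist = dist;
            best = i;
        }
    }
    return best;
}

/* One training step. Nodes within `radius` map positions of the winner
   move toward v, with influence falling off linearly with map distance. */
size_t som_train(Som *som, const double *v, double rate, int radius) {
    size_t bmu = som_best_match(som, v);
    for (size_t i = 0; i < SOM_NODES; i++) {
        int map_dist = (int)i - (int)bmu;
        if (map_dist < 0) map_dist = -map_dist;
        if (map_dist > radius) continue;
        double influence = 1.0 - (double)map_dist / (double)(radius + 1);
        for (int d = 0; d < SOM_DIM; d++) {
            som->weights[i][d] += rate * influence * (v[d] - som->weights[i][d]);
        }
    }
    return bmu;
}
```

Because neighbors on the map are dragged along together, nearby nodes end up representing similar frames, which is what makes the map useful for visualization.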

Page 29: KIPA Game Engine Seminars

29

Demo of SOM-enabled Profiling Tool

• Visualizations are still early

• Hopefully they will mature into something truly useful (people in other visualization fields like SOMs, so hopes are high)

Page 30: KIPA Game Engine Seminars

30

Discussions of changes made to SOM to support online clustering