Status – Week 231
Victor Moya
Summary
  Primitive Assembly.
  Clipping triangle rejection.
  Rasterization.
  Triangle Setup.
  Early Z.
  Current status.
Primitive Assembly
  Works as an LRU cache.
  Asks the Post T&L cache for missing vertices.
  Checks if some of the new vertices are already in the primitive assembly cache.
  Three vertices stored (2 for triangles, 3 for quads).
  Last vertex is always bypassed directly to Triangle Setup.
Clipping
  Check clipping per vertex.
  Use the Cohen & Sutherland method.
  Reject full triangles.
  Do not generate new triangles.
Cohen & Sutherland Clipping
  Cohen & Sutherland: 6-bit outcode.
  If all three vertices have 000000 outcodes, the triangle is inside the frustum volume.
  If the logical AND of the three vertex outcodes is 0, the triangle may intersect the frustum volume (it cannot be trivially rejected).
  If the logical AND of the three vertex outcodes is not 0, the triangle is outside the frustum volume.
Sutherland & Cohen
  0101 0001 1001
  0100 0000 1000
  0110 0010 1010
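
The trivial accept/reject test described above can be sketched as follows. This is only an illustration, not the simulator's code: the `Vec4` type, the bit assignment of the six outcode bits and the clip volume convention (-w <= x, y, z <= w) are assumptions, since the slides only state that the outcode has 6 bits.

```cpp
#include <cstdint>

// Homogeneous clip-space position (hypothetical type, not from the slides).
struct Vec4 { double x, y, z, w; };

// Assumed bit assignment: one bit per frustum plane.
enum : std::uint8_t {
    OUT_LEFT   = 1 << 0, OUT_RIGHT = 1 << 1,
    OUT_BOTTOM = 1 << 2, OUT_TOP   = 1 << 3,
    OUT_NEAR   = 1 << 4, OUT_FAR   = 1 << 5
};

// Computes the 6-bit outcode of a vertex, assuming the clip volume
// -w <= x, y, z <= w.
std::uint8_t outcode(const Vec4& v) {
    std::uint8_t code = 0;
    if (v.x < -v.w) code |= OUT_LEFT;
    if (v.x >  v.w) code |= OUT_RIGHT;
    if (v.y < -v.w) code |= OUT_BOTTOM;
    if (v.y >  v.w) code |= OUT_TOP;
    if (v.z < -v.w) code |= OUT_NEAR;
    if (v.z >  v.w) code |= OUT_FAR;
    return code;
}

enum class ClipResult { Inside, Outside, MayIntersect };

// Classifies a triangle from its three vertex outcodes.
ClipResult classify(const Vec4& a, const Vec4& b, const Vec4& c) {
    const std::uint8_t ca = outcode(a), cb = outcode(b), cc = outcode(c);
    if ((ca | cb | cc) == 0) return ClipResult::Inside;   // all outcodes are 000000
    if ((ca & cb & cc) != 0) return ClipResult::Outside;  // all vertices beyond one plane
    return ClipResult::MayIntersect;                      // cannot be trivially decided
}
```

A clipper that only rejects full triangles would drop the `Outside` case and pass the other two cases down unchanged.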
Clipper
  [Figure: box diagram with the boxes StreamerCommit, PrimitiveAssembly and Clipper, connected by the signals "transformed vertices", "assembled triangles" and "clipped triangles".]
Clipping
  Alternatives:
    Full clipping:
      Sutherland-Hodgman:
        – Clip the triangle against each clip plane in an iterative way.
      Other algorithms similar to the previous one.
      Generates new triangles.
    No clipping:
      Guard band and scissoring for left, right, bottom and top clipping.
      2DH rasterization (Olano) for w < 0 clipping.
      More hardware needed in the rasterizer.
      Possible fill rate loss.
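
To illustrate why full clipping generates new triangles, here is a minimal sketch of a single Sutherland-Hodgman pass against one plane. The `Vec4` type, the helper names and the choice of the right plane (x <= w) are assumptions for the example; a full clipper would repeat the pass for each clip plane.

```cpp
#include <cstddef>
#include <vector>

// Homogeneous clip-space position (hypothetical type, not from the slides).
struct Vec4 { double x, y, z, w; };

// Signed distance to the right clip plane, assuming the convention x <= w:
// a non-negative value means the vertex is on the visible side.
double distRight(const Vec4& v) { return v.w - v.x; }

// Linear interpolation between two vertices.
Vec4 mix(const Vec4& a, const Vec4& b, double t) {
    return { a.x + t * (b.x - a.x), a.y + t * (b.y - a.y),
             a.z + t * (b.z - a.z), a.w + t * (b.w - a.w) };
}

// One Sutherland-Hodgman pass: clips a convex polygon against a single plane.
// Vertices lying exactly on the plane may be emitted twice in this sketch.
template <typename Dist>
std::vector<Vec4> clipAgainstPlane(const std::vector<Vec4>& poly, Dist dist) {
    std::vector<Vec4> out;
    for (std::size_t i = 0; i < poly.size(); ++i) {
        const Vec4& cur = poly[i];
        const Vec4& nxt = poly[(i + 1) % poly.size()];
        const double dc = dist(cur), dn = dist(nxt);
        if (dc >= 0.0) out.push_back(cur);              // keep the visible vertex
        if ((dc >= 0.0) != (dn >= 0.0))                 // the edge crosses the plane
            out.push_back(mix(cur, nxt, dc / (dc - dn)));
    }
    return out;
}
```

Clipping a triangle that has one vertex outside the plane returns four vertices, which a triangle fan turns into two triangles. This is the extra geometry that the reject-only clipper avoids.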
Rasterization
  [Figure: boxes Clipper, Triangle Setup, Traversal and Interpolation; Rasterizer Emulator with the functions Setup(vattrib[3]), nextFragment() and Interpolate(fr).]
Rasterization
  Traversal algorithm has been reimplemented:
    – It did not work for small, thin triangles :P.
  Adjoint matrix (setup matrix) and edge equation math all moved to 64-bit FP:
    – There was a precision problem.
  There is still some problem with edges with 0 and 1 slope (vertical and horizontal edges).
  Added screen ‘scissoring’ to the traversal algorithm.
  Added a bottom bound line traversal ending condition.
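
As an illustration of the setup math mentioned above, the sketch below computes the three edge equations as the rows of the adjoint of the vertex matrix, using 64-bit floating point. It assumes the 2D homogeneous (2DH) formulation by Olano referenced elsewhere in these slides; the type and function names are hypothetical and are not the Rasterizer Emulator's interface.

```cpp
#include <array>

// Vertex in 2D homogeneous coordinates (hypothetical type).
struct Vertex2DH { double x, y, w; };

// Edge equation E(x, y) = a*x + b*y + c.
struct Edge { double a, b, c; };

// Cross product of two (x, y, w) vectors; each row of the adjoint of the
// 3x3 vertex matrix is one such cross product.
Edge cross(const Vertex2DH& p, const Vertex2DH& q) {
    return { p.y * q.w - p.w * q.y,
             p.w * q.x - p.x * q.w,
             p.x * q.y - p.y * q.x };
}

struct TriangleSetup {
    std::array<Edge, 3> edge;  // rows of the adjoint (setup) matrix
    double det;                // determinant of the vertex matrix
};

TriangleSetup setup(const Vertex2DH& v0, const Vertex2DH& v1, const Vertex2DH& v2) {
    const Edge e0 = cross(v1, v2), e1 = cross(v2, v0), e2 = cross(v0, v1);
    const double det = e0.a * v0.x + e0.b * v0.y + e0.c * v0.w;
    return TriangleSetup{ {{ e0, e1, e2 }}, det };
}

double evaluate(const Edge& e, double x, double y) { return e.a * x + e.b * y + e.c; }

// A sample is inside when every edge equation has the sign of the determinant.
// This sketch accepts both triangle facings and rejects degenerate triangles.
bool inside(const TriangleSetup& s, double x, double y) {
    if (s.det == 0.0) return false;
    for (const Edge& e : s.edge) {
        if (evaluate(e, x, y) * s.det < 0.0) return false;
    }
    return true;
}
```

Because the adjoint is used instead of the inverse, no division is needed during setup; the sign of the determinant is enough to orient the inside test.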
Rasterization
  Current traversal algorithm:
    Select the top most vertex.
    Divide by w (project) and map the top most vertex to the viewport.
    Start traversal moving left and checking right and down.
    Move left until outside of the triangle.
    Move right if a right save is available.
    Move right until outside of the triangle.
    Move down if a down save is available.
    Repeat.
Rasterization
  Traversal algorithm:
    Ending condition: next pixel outside or below the bottom bound line.
    Special case for thin triangles:
      – Cross from negative to positive through the ‘vertical’ edges.
      – Keep going until the bottom bound line.
Rasterization
  [Figure: traversal example; legend: scanned, generated inside, generated outside.]
Rasterizer
  Problems:
    Algorithm needs a start point.
    Unclipped triangles.
    Top most vertex can be:
      far outside of the frustum volume:
        – Many cycles required to travel to the inside.
      behind the eye:
        – The triangle is inverted.
    Top most vertex must be projected and mapped to the viewport.
    Others?
Rasterizer
  Alternatives:
    Implement projection (divide by w) and use a normal rasterization algorithm (no 2DH).
    Use iterative hierarchical approaches (McCool):
      – May spend many cycles until the first fragment is generated.
    Find a way of creating a bounding box for the triangle without projection, clipping or viewport mapping?
Problems
  Current implementation must count the number of passing triangles in each box to detect the end of a batch.
  ‘Empty’ culled triangles (culled in primitive assembly, clipper or triangle setup) must be passed down to FragmentFIFO.
  This wastes cycles in traversal, interpolation and the fragment FIFO.
  Other solutions?
Synchronization
  Ready signals vs Request signals.
  Ready/Busy signals need at least a buffer (for the incoming data) with the size of the signal latency in the sender.
    – This buffer could remain empty most of the time.
  Request signals require a counter (displacement counter?) in the receiver.
    – The counter could be updated more than once per cycle (multiple requests, multiple requests serviced).
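
The request-based scheme can be sketched with a small counter class. This is one possible reading of the slide, assuming the counter lives in the box that receives the request signals (the data producer); the class and method names are hypothetical.

```cpp
// Hypothetical counter of outstanding requests, kept by the box that
// receives request signals and sends data in response.
class RequestCounter {
public:
    // Called once per request signal read in a cycle; several requests
    // may arrive in the same cycle.
    void addRequests(unsigned n) { pending_ += n; }

    // Returns how many of the 'available' data items may be sent this
    // cycle and removes that many requests from the counter.
    unsigned service(unsigned available) {
        const unsigned n = available < pending_ ? available : pending_;
        pending_ -= n;
        return n;
    }

    unsigned pending() const { return pending_; }

private:
    unsigned pending_ = 0;
};
```

Since data is only sent against a previously counted request, the consumer never receives data it has no room for, regardless of the signal latency. The cost is that the counter may be both incremented and decremented within the same cycle.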
Synchronization
  [Figure: timing diagram for cycles n, n + 1 and n + 2 showing data signal write/read and ready signal write/read events with ready/busy states. Annotation: no buffer available so data is lost.]
Synchronization
  Alternatives:
    Start latencies of the consumer box are fixed and/or known by the producer box.
Next steps
  Add the clipper box for trivial reject/accept of triangles against the frustum volume.
  Add the configuration options and file that were discussed months ago (signals, latencies, etc.).
  Add new boxes: early Z, pixel shader, texture units, etc.
  Implement a more decent Memory Controller.
Early Z
  Could be implemented before interpolation.
    – Interpolate the triangle Z (z/w) first.
    – Could save some calculations.
    – Would it save time?
  Implement hierarchical Z: multilevel z buffer in internal/external memory.
  Must be disabled if pixel shaders write z.
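
A minimal sketch of the early Z idea follows: the fragment depth is tested before the remaining attributes are interpolated, and the early test is skipped when the shader writes z. The `DepthBuffer` class, the less-than compare function and the `processFragment` helper are assumptions for illustration only; hierarchical Z is not covered.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical depth buffer, cleared to the far value 1.0.
class DepthBuffer {
public:
    DepthBuffer(std::size_t width, std::size_t height)
        : width_(width), depth_(width * height, 1.0) {}

    // Less-than depth test (assumed compare function); stores z on pass.
    bool testAndUpdate(std::size_t x, std::size_t y, double z) {
        double& stored = depth_[y * width_ + x];
        if (z < stored) { stored = z; return true; }
        return false;
    }

private:
    std::size_t width_;
    std::vector<double> depth_;
};

// Runs the early Z test before the rest of the attributes are interpolated.
// 'interpolateRest' stands for the remaining per-fragment work. When the
// shader writes z, early Z is disabled and the depth test would have to run
// after shading (not shown here).
template <typename InterpolateRest>
bool processFragment(DepthBuffer& zb, std::size_t x, std::size_t y, double z,
                     bool shaderWritesZ, InterpolateRest interpolateRest) {
    if (!shaderWritesZ && !zb.testAndUpdate(x, y, z)) {
        return false;  // culled before any other attribute is interpolated
    }
    interpolateRest(x, y);
    return true;
}
```

Whether this saves time in practice depends on how many fragments fail the test, which is the open question raised in the slide.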
Pixel Shader
  Add new math instructions to the Shader emulator.
  Add texture instructions to the Shader emulator.
  Design a megathreaded shader (maybe up to 100s of fragments in flight).
  Design communication with the texture unit.
Texture Unit
  Design texture unit.
  Create a texture emulator.
  Texture storage in memory?
  Texture compression?
  Texture cache?
  Filtering methods:
    – Bilinear.
    – Trilinear.
    – Anisotropic.
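
As a starting point for the texture emulator, bilinear filtering could look like the sketch below. The `Texture` layout (single channel, row-major), the clamp-to-edge addressing and the texel-center convention are assumptions chosen for the example, not decisions taken in the slides.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical single-channel texture stored in row-major order.
struct Texture {
    std::size_t width, height;
    std::vector<float> texels;

    // Texel fetch with clamp-to-edge addressing (assumed wrap mode).
    float at(long x, long y) const {
        x = std::max(0L, std::min(x, static_cast<long>(width) - 1));
        y = std::max(0L, std::min(y, static_cast<long>(height) - 1));
        return texels[static_cast<std::size_t>(y) * width + static_cast<std::size_t>(x)];
    }
};

// Bilinear filter: weighted average of the four texels nearest to (u, v),
// with u and v in [0, 1] and texel centers at half-integer positions.
float sampleBilinear(const Texture& t, float u, float v) {
    const float x = u * t.width - 0.5f;
    const float y = v * t.height - 0.5f;
    const float x0 = std::floor(x), y0 = std::floor(y);
    const float fx = x - x0, fy = y - y0;
    const long ix = static_cast<long>(x0), iy = static_cast<long>(y0);
    const float top = t.at(ix, iy) * (1.0f - fx) + t.at(ix + 1, iy) * fx;
    const float bot = t.at(ix, iy + 1) * (1.0f - fx) + t.at(ix + 1, iy + 1) * fx;
    return top * (1.0f - fy) + bot * fy;
}
```

Trilinear filtering would blend two such samples taken from adjacent mipmap levels, so this function is a reasonable building block for it.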
Memory Controller
  GDDR3?
  Need to read the current memory specifications.
    – Banks, access methods, bus widths ...
  Memory buses to the GPU units?
  Priority, policies?