Department of Science and Technology Institutionen för teknik och naturvetenskap Linköping University Linköpings universitet
SE-601 74 Norrköping, Sweden 601 74 Norrköping
LiU-ITN-TEK-A-14/009-SE
Dynamisk visualisering av rymdvädersimuleringsdata
Victor Sand
2014-05-16
Master's thesis in Media Technology carried out at the Institute of Technology,
Linköping University
Victor Sand
Supervisor: Alexander Bock
Examiner: Anders Ynnerman
Norrköping 2014-05-16
Upphovsrätt (Copyright)
This document is kept available on the Internet, or its future replacement, for a considerable time from the date of publication, provided that no extraordinary circumstances arise.
Access to the document implies permission for anyone to read, download and print single copies for personal use, and to use it unchanged for non-commercial research and for teaching. A later transfer of copyright cannot revoke this permission. All other use of the document requires the consent of the copyright holder. Technical and administrative solutions exist to guarantee authenticity, security and accessibility.
The author's moral rights include the right to be named as the author, to the extent required by good practice, when the document is used as described above, as well as protection against the document being altered or presented in a form or context that is offensive to the author's literary or artistic reputation or distinctive character.
For further information about Linköping University Electronic Press, see the publisher's home page http://www.ep.liu.se/
Copyright
The publishers will keep this document online on the Internet - or its possible replacement - for a considerable time from the date of publication barring exceptional circumstances.
The online availability of the document implies a permanent permission for anyone to read, to download, to print out single copies for your own use and to use it unchanged for any non-commercial research and educational purpose. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional on the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility.
According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement.
For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its WWW home page: http://www.ep.liu.se/
© Victor Sand
Dynamic Visualization
of Space Weather Data
Victor Sand
Civilingenjör Medieteknik
Linköping University
Master’s thesis
Goddard Space Flight Center, Maryland, USA
Norrköping, Sweden
June 2014
Abstract
The work described in this thesis is part of the Open Space project, a
collaboration between Linköping University, the National Aeronautics and
Space Administration and the American Museum of Natural History. The
long-term goal of Open Space is multi-purpose, open-source scientific
visualization software.
The thesis covers the research and implementation of a pipeline for preparing
and rendering volumetric data. The developed pipeline consists of three
stages: a data formatting stage, which takes data from various sources and
prepares it for the rest of the pipeline; a pre-processing stage, which builds a
tree structure out of the raw data; and finally an interactive rendering stage,
which draws a volume using ray-casting.
Large parts of the system are built around a Time-Space Partitioning
tree, originally described by Shen et al. This tree structure uses an
error metric system and an octree-based structure to efficiently choose the
appropriate level of detail during rendering. The data storage and structure
are similar to those in the GigaVoxels system by Crassin et al. By combining
these concepts and constructing the pipeline around them,
space weather related volumes have been successfully rendered at interactive
rates.
The pipeline is a fully working proof-of-concept for future development of
Open Space, and can be used as-is to render space weather data. Many
concepts and ideas from this work can be utilized in the larger-scale software
project.
Acknowledgements
First of all, I would like to thank my examiner, Professor Anders Ynnerman,
for the fantastic opportunity and for keeping the project running.
Thanks also to my excellent advisor Alexander Bock for many late hours of
support and idea discussions. Your willingness to help and share your vast
graphics knowledge has been truly invaluable.
Thank you Masha for your tireless and dedicated work with CCMC and for
taking care of us thesis students. Your genuine interest in the project is a
requirement for its success! I’m sure the next couple of students will feel
just as welcome. Thank you Carter for keeping us busy and for the great
private tour of the museum. Bob, thank you for keeping an eye on the big
picture!
Aki, thanks for making my commute shorter, my lunches more tasty and
my country music knowledge more solid. Nate, thanks for being a bro and
thanks Avery for letting me sleep on your floor for a while. Come to Sweden
and I’ll repay the favors! Thanks Martin for doing a great job during the
first stage of the project and thereby making my job easier. Thanks to my
many different roommates and friends in Washington D.C. for making my
stay so much more than only work. I hope to see many of you again soon!
Many thanks to Holmen AB, Sparbankstiftelsen Alfa and Stiftelsen Anna
Whitlocks Minnesfond for the financial help when CSN wouldn’t lend me
more money. I could not have completed my stay without it.
Finally, thanks to my family for the endless support and encouragement!
Victor
Stockholm, February 2014
Contents
1 Introduction
  1.1 Background
  1.2 Aim and Goals
  1.3 Method
  1.4 Limitations
  1.5 Thesis Structure
2 Background
  2.1 Space Weather
  2.2 Community Coordinated Modeling Center
  2.3 Open Space
3 Previous Work
  3.1 Volume Ray-Casting
  3.2 TSP Tree Acceleration
      Overview and Motivation
      Structure
      Traversal
      Error Metrics
  3.3 Rendering of Large Voxel Datasets
      Data Structure
      Rendering
4 Pipeline Overview
  4.1 Pipeline Stages
  4.2 Inputs and Outputs
5 TSP Tree Implementation
  5.1 Bricks
  5.2 Separation of Structure and Data
  5.3 Memory Layout
  5.4 Error Metrics
  5.5 Pointer Structure
  5.6 Traversal
6 Data Formatting
  6.1 Space Weather Data Sources
      ENLIL
      CDF Data Format
      Kameleon
  6.2 Furnace
  6.3 Voxel Data Format
7 Data Pre-Processing
  7.1 Forge
  7.2 TSP Tree Construction
      Brick Padding
      Octree Construction
      BST Assembling
  7.3 TSP Data Format
8 Rendering
  8.1 Flare
  8.2 TSP Structure and Error Metrics Construction
      TSP Structure Construction
      Error Metrics Calculation
      Error Caching
  8.3 Intra-Frame Pipeline
      View Ray Generation
      TSP Tree Probing
      Brick Uploading
      Ray-Casting
  8.4 Asynchronous Execution
  8.5 Rendering Parameters
  8.6 Cluster Rendering
      SGCT
9 Results
  9.1 Hardware
  9.2 Rendering Benchmarks
  9.3 Error Metrics Benchmarks
  9.4 Visual Results
      Desktop Rendering
      Dome Rendering
10 Discussion and Future Work
  10.1 Visual Quality
      Rendering
  10.2 Interactivity
      Performance
  10.3 Pipeline
      Encapsulation
  10.4 TSP Tree Structure
      Construction
      Storage
  10.5 Data Formats
  10.6 Error Metrics
      Calculation
      Control
  10.7 Rendering
11 Conclusions
References
A Code Samples
  A.1 TSP Tree Traversal
  A.2 Brick Padding
  A.3 Octree Construction
  A.4 BST Assembling
  A.5 TSP Tree Structure Construction
  A.6 Error Metrics
  A.7 Rendering Loop
1 Introduction
This first chapter briefly discusses the background and goals of the work. It also
describes the methods used as well as the structure and limitations of the thesis. The
background is described in more detail in chapter 2.
1.1 Background
Open Space is the working title for a project initiated in the fall of 2012. Collaborators
in the project are Linköping University, the Community Coordinated Modeling Center
(CCMC) at the National Aeronautics and Space Administration (NASA) and the
American Museum of Natural History (AMNH). The long-term goal of Open Space is
open-source scientific visualization software with a focus on space-related data sources.
This software will be capable of producing efficient, accurate and beautiful visualizations
of phenomena on a scale ranging from the size of atoms to the size of the entire
known universe. The software will be used both for scientific work and for public
dissemination.
In order to accomplish this goal, the participants are engaged in a Master’s thesis
student project. This collaboration enables students from Linköping to be on-site at
NASA Goddard Space Flight Center, working close to the NASA scientists and the data
sources. Input and feature requests for the project come from all three stakeholders,
giving the project a broad purpose that is rooted in computer graphics research, in
space science and in multimedia.
The part of the Open Space project described in this thesis aims to efficiently render
time-varying data sets of space weather using volumetric voxel rendering.
1.2 Aim and Goals
One of the most challenging aspects of volumetric rendering is handling large data
sets efficiently. Time-varying data sets provide additional challenges due to memory
limitations and the need to update the rendering often in order to achieve an animation
with an acceptable frame rate.
The aim of the work presented within this thesis is to implement an efficient volumetric
rendering pipeline, capable of handling large time-varying data sets. The results
of this work will later be implemented in the larger-scale Open Space project.
Additionally, the implemented rendering system will have enough functionality to provide
visualizations that can be used in presentations, videos et cetera.
1.3 Method
The thesis work will be carried out by implementing a volumetric rendering system from
the ground up. The input to this rendering system will be data from space weather
simulations. The system will be continuously improved as the work develops. Having
a basic functionality working early enables an iterative approach, and makes modular
implementation and testing easier as more advanced features are implemented.
1.4 Limitations
Since the thesis focuses on the rendering efficiency and the pipeline, less focus will be
put on the space weather application domain. Although the software will be capable of
rendering arbitrary volumetric data provided the right preprocessing steps are taken,
a smaller amount of time is spent on the applications than in the previous prototyping
phase (see section 2.3).
For the same reasons, the rendering techniques are very simple compared to what
is possible today.
1.5 Thesis Structure
To properly familiarize the reader with the subject and the project that this thesis
work is a part of, the thesis will start with a brief section on space weather and some
of the background of the collaboration. Then some previous work on Open Space and
computer graphics will be presented, before describing the implemented pipeline. The
chapter “Pipeline Overview” does not go into any implementation details, but is very
useful for putting the subsequent chapters in context. After the high-level overview,
some time is spent on describing the implementation of one of the main techniques, the
Time-Space Partitioning (TSP) tree. Its methods are used in many parts of the
pipeline, and are therefore presented early in the thesis. Following the introductory
chapters are three chapters that each describe a different stage of the pipeline.
Results, discussion and future work are presented last in the main
part of the thesis. To further explain some of the implementation details, an appendix
with selected code samples is included at the back.
2 Background
This chapter provides context to the thesis by outlining the Open Space project,
and briefly discussing what space weather is and how it is being studied.
2.1 Space Weather
The National Research Council explains the concept of space weather in the following
way (1).
“Space weather” describes the conditions in space that affect Earth and
its technological systems. Our space weather is a consequence of the behavior
of the sun, the nature of Earth’s magnetic field and atmosphere, and
our location in the solar system.
The National Space Weather Program Council has a similar description of the
subject and also mentions the effects that space weather can have on Earth (2):
“Space weather” refers to conditions on the sun and in the solar wind,
magnetosphere, ionosphere, and thermosphere that can influence the performance
and reliability of space-borne and ground-based technological systems
and can endanger human life or health. Adverse conditions in the space
environment can cause disruptions of satellite operations, communications,
navigation, and electric power distribution grids, leading to a variety of
socioeconomic losses.
2.2 Community Coordinated Modeling Center
Given the possible effects on Earth, it is desirable to study and predict space weather
events. The Community Coordinated Modeling Center (CCMC) at NASA Goddard
Space Flight Center works with space weather simulation and forecasting. The center
also provides the scientific community with access to the models and resources for
development and research.
2.3 Open Space
The prototyping phase of the Open Space project resulted in a thesis by Törnros (3).
This work contains a thorough summary of the modeling and simulation tools used
at CCMC, as well as an overview of pre-existing visualization software. The thesis
also presents an approach for visualizing space weather data by means of volumetric
rendering and ray-casting. Voreen (4), an open-source software package for interactive
volume rendering, is used and extended to produce interactive renderings of space weather
events. These renderings are done for one time step at a time. Screenshots from
Törnros’ thesis can be found in figures 2.1 and 2.2.
The results of this prototyping phase provide an entry point for the work described
in this thesis, where the goal is to enable working with time-varying data sets.
Figure 2.1: Screenshot of a coronal mass ejection event visualization
Figure 2.2: Screenshot of the Voreen workspace
3 Previous Work
This chapter provides a theoretical background to the
techniques and concepts used in the implementation, mainly related to volumetric
visualization and rendering of large voxel datasets.
3.1 Volume Ray-Casting
Volume ray-casting is an image-order volume rendering technique. This means that
the image is produced by iterating over pixels rather than over objects in the
scene. To determine the color of each pixel, view rays are sent from the position of
the camera through the volume (figure 3.1), and the volume is sampled at points along
these rays.
As the samples along each ray are gathered, each intensity is mapped to an RGBA
color using a transfer function (figure 3.2). The colors from the transfer function
mappings are composited into the final ray color using front-to-back compositing. The
composited color and opacity $C'_i$ and $A'_i$ are calculated from the values accumulated
so far, $C'_{i-1}$ and $A'_{i-1}$, and the mapped color and opacity of the current sample,
$C_i$ and $A_i$, as given in equation 3.1.

\[
\begin{aligned}
C'_i &= C'_{i-1} + (1 - A'_{i-1})\, C_i \\
A'_i &= A'_{i-1} + (1 - A'_{i-1})\, A_i
\end{aligned}
\tag{3.1}
\]
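
The recurrence in equation 3.1 can be illustrated with a few lines of Python. This is only a minimal sketch and not the renderer described later in the thesis: the function name `composite_front_to_back` is hypothetical, the samples are assumed to be ordered from the camera into the volume, and each color is assumed to be already weighted by its opacity.

```python
def composite_front_to_back(samples):
    """Composite (color, alpha) samples ordered from the eye into the volume.

    Assumption: every color is an RGB tuple already weighted by its alpha.
    """
    acc_color = (0.0, 0.0, 0.0)  # accumulated color, C' in equation 3.1
    acc_alpha = 0.0              # accumulated opacity, A' in equation 3.1
    for color, alpha in samples:
        # Later samples only contribute through what is still transparent.
        weight = 1.0 - acc_alpha
        acc_color = tuple(acc + weight * c for acc, c in zip(acc_color, color))
        acc_alpha = acc_alpha + weight * alpha
    return acc_color, acc_alpha


# Two half-opaque samples: a red one in front of a green one.
ray_color, ray_alpha = composite_front_to_back(
    [((0.5, 0.0, 0.0), 0.5), ((0.0, 0.5, 0.0), 0.5)]
)
```

Because the accumulated opacity only grows, a sample behind an almost opaque region changes the result very little, which is why front-to-back ordering is convenient for ray-casting.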
Figure 3.1: The concept of volume ray-casting. View rays are shot from a virtual
eye/camera position through the image plane.
Figure 3.2: Top: The volume is sampled along the view ray. Bottom: Each sampled
intensity is mapped to a color using a transfer function.
3.2 TSP Tree Acceleration
While straightforward rendering techniques can be adequate for small data sets, they
are often not efficient enough for large amounts of data with high demands on
speed. Researchers in volumetric rendering, and in 3D graphics in general,
continuously strive to improve the efficiency of data handling through various acceleration
structures.
One such structure is called a Time-Space Partitioning (TSP) tree, and is the chosen
data structure for this work. Implementation details are described in chapter 5.
Overview and Motivation
The TSP tree was first introduced by Shen et al. (5) and was later improved by
Ellsworth et al. (6). It is designed to capture and exploit both temporal and spatial
coherency in a time-varying data set. The tree traversal algorithm uses user-supplied
error tolerances to choose the correct level of detail at runtime. By separating the
time domain from the spatial domain and treating them differently, the scheme can
efficiently handle data sets where there is a large discrepancy between the resolutions
in those domains. Error metrics are stored in the tree nodes, and the tree can be built
once and then used repeatedly.
Structure
A TSP tree uses a complete octree as a skeleton. This octree subdivides the volume
until a certain spatial subdivision level has been reached. Each octree node (inner
nodes as well as leaves) in turn contains a binary search tree (BST) that holds the
temporal information for that spatial subdivision. The BST leaves are the
individual time steps, and each level above the leaves represents a time span of twice
the length. The BST roots represent the whole temporal extent; in other
words, they represent averages of the octree nodes’ values over all time
steps. The overall structure is illustrated in figure 3.3.
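
To get a feeling for the size of such a structure, the node counts can be worked out with a short Python sketch. The function names are hypothetical, and two assumptions are made that the text does not state explicitly: "s spatial subdivisions" is read as an octree with levels 0 to s, and the number of time steps is a power of two so that the BST is complete.

```python
def octree_node_count(subdivisions):
    # A complete octree has 8**level nodes on each level 0..subdivisions.
    return sum(8 ** level for level in range(subdivisions + 1))


def bst_node_count(time_steps):
    # Assumption: one leaf per time step and a power-of-two number of steps.
    if time_steps < 1 or time_steps & (time_steps - 1):
        raise ValueError("time_steps must be a power of two")
    return 2 * time_steps - 1


def bst_time_span(level, time_steps):
    # Time steps covered by a BST node on `level` (0 is the root).
    return time_steps >> level


def tsp_node_count(subdivisions, time_steps):
    # Every octree node carries its own BST.
    return octree_node_count(subdivisions) * bst_node_count(time_steps)


# Octree counterpart of the figure 3.3 setup: two subdivisions, eight time steps.
total_nodes = tsp_node_count(2, 8)
```

Under these assumptions the example has 73 octree nodes with 15 BST nodes each, and each BST level halves the covered time span from eight steps at the root down to one step at the leaves.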
Traversal
Figure 3.3: TSP structure (illustrated using a quadtree). The example uses two spatial
subdivisions and eight time steps. The top section represents the octree skeleton, and the
bottom tree is the binary search tree for one of the octree nodes.

TSP tree traversal starts in the octree. For every octree node, the corresponding BST
is traversed top-down until a node with satisfying error metrics is found or a leaf
is reached. If the error at the leaf is too big, the traversal continues with the next
subdivision level in the octree. See section 5.6 for the full TSP traversal algorithm.
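
The traversal order described above can be sketched as follows. This is a simplified illustration and not the algorithm from section 5.6: the classes `BstNode` and `OctreeNode`, the function `select_nodes` and the way both tolerances are tested together are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BstNode:
    t_start: int                       # first covered time step
    t_end: int                         # one past the last covered time step
    spatial_error: float
    temporal_error: float
    left: Optional["BstNode"] = None   # earlier half of the time span
    right: Optional["BstNode"] = None  # later half of the time span


@dataclass
class OctreeNode:
    bst_root: BstNode
    children: List["OctreeNode"] = field(default_factory=list)


def select_nodes(octree_node, timestep, spatial_tol, temporal_tol, selected):
    """Collect the BST nodes to render for `timestep` (hypothetical helper)."""
    node = octree_node.bst_root
    while True:
        # Accept the first node whose stored errors are within the tolerances.
        if node.spatial_error <= spatial_tol and node.temporal_error <= temporal_tol:
            selected.append(node)
            return
        if node.left is None:          # BST leaf reached, error still too big
            break
        middle = (node.t_start + node.t_end) // 2
        node = node.left if timestep < middle else node.right
    if octree_node.children:
        # Continue with the next spatial subdivision level.
        for child in octree_node.children:
            select_nodes(child, timestep, spatial_tol, temporal_tol, selected)
    else:
        selected.append(node)          # finest data available, use it anyway
```

With loose tolerances the function stops at a coarse node near the top of the structure, while tight tolerances push the selection towards single time steps and the octree leaves.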
Error Metrics
The concept of error metrics is key in the use of TSP tree techniques. To separate the
spatial and temporal domains, two different error metrics are used by the TSP tree
algorithm. The spatial error indicates how coherent the voxels within a subvolume are,
and the temporal error is a measure of how coherent the voxels are across two or more
time steps. The first TSP tree publication (5) uses an error metric based on the scalar
values of voxels. To make the error metrics more accurate and more closely related to the
visible image, a color-based approach was introduced (6). The color-based approach is
useful for any image where a mapping from scalar values to colors is used, for example
when using transfer functions.
3.3 Rendering of Large Voxel Datasets
One of the more prominent works in the field of voxel graphics is the GigaVoxels
system, presented in the Ph.D. thesis by Crassin et al. (7). The work outlines an
extensive pipeline for handling very large sets of data. Although the thesis largely
covers the subject of converting traditional scenes to voxels (in contrast to working directly
with voxels), it also develops a sophisticated pipeline for processing the data. This
pipeline mainly deals with transferring data between system and video memory through
a custom GPU paging system.
Data Structure
GigaVoxels makes use of a spatial, octree-based structure for hierarchical space
subdivision. The smallest entities in this subdivision, the octree nodes, are called bricks:
small voxel grids that represent the volume’s subdivision at a given level. Bricks make
it possible to use the efficient 3D texture features of GPUs while keeping the
subdivision flexible.
Furthermore, GigaVoxels uses brick pointers. These pointers are separated from the
data itself and only point to the original data in a brick pool. The hierarchical structure
can be easily represented using these brick pointers rather than the full data, making
the traversal of the structure much faster.

Figure 3.4: Screenshot from the GigaVoxels system. Image from
http://gigavoxels.inrialpes.fr/
Rendering
The rendering algorithm in GigaVoxels is split into two passes, both done by one big
GPU kernel. One pass traverses the octree top-down, and the other samples the volume.
The level of detail (subdivision level in the octree) is chosen during the traversal step
and is based on the projected size of the voxels on the screen.
The system is built for large datasets where the whole scene is far too big for video
memory. A GPU paging system loads data on the fly, driven by requests from the ongoing
ray sampling pass. A caching system keeps track of the recently used bricks, making
room for new bricks in GPU memory when needed.
4 Pipeline Overview
The implemented pipeline consists of three main stages called Furnace (data
formatting), Forge (data pre-processing) and Flare (rendering). These three stages are
described in detail in chapters 6, 7 and 8. This chapter provides a high-level overview of
the whole pipeline.
4.1 Pipeline Stages
Figure 4.1: Overview of the three pipeline stages Furnace, Forge and Flare
The three parts of the pipeline do not interact with each other during runtime, and
each stage is run separately. The separation and encapsulation provide efficient and
customizable processing of the data, in which each phase refines and prepares the data
for rendering.
Furnace funnels various external data sources into a format that the rest of the
pipeline can use. Forge takes the output from Furnace and builds the tree structures
that Flare then uses during rendering, which is the final step.
4.2 Inputs and Outputs
To achieve encapsulation between the parts of the application, the preparation stages
Furnace and Forge output binary files on disk. This means that as soon as a previous
step is completed, the next step only needs the produced file and its structure.
The input to the first step, Furnace, is the volume data to render. This data could
be formatted and delivered in any way, so a Furnace module for each data source needs
to be written. Furnace then outputs a file where the voxel data for all time steps is
saved. Forge reads this straightforward time-varying volume data, builds the TSP
tree structure and saves it to a new, separate file. The same input data can be used
to produce different configurations of the tree. Flare uses one of the TSP tree files and
renders it.
5 TSP Tree Implementation
Before going into the details of the pipeline stages, it is useful to know how the TSP
tree is implemented and how its structure affects different approaches and techniques
throughout the software. This chapter describes some implementation details in the
TSP tree usage.
5.1 Bricks
A tree structure where the leaves are individual voxels would induce a very large
overhead. For this reason, the smallest element in the tree is a brick, a subvolume
of voxels. As an example, a volume of 128 × 128 × 128 voxels using 16 × 16 × 16 bricks
would have room for 128/16 = 8 bricks per axis. Brick usage is a common concept and
is used by both the TSP (5, 6) and the GigaVoxels (7) authors.
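
The brick arithmetic in the example can be written out as a small Python sketch. The function names are hypothetical, and the sketch assumes cubic bricks whose side length divides the volume dimension evenly.

```python
def bricks_per_axis(volume_dim, brick_dim):
    # Assumption: the volume dimension is a whole multiple of the brick size.
    if volume_dim % brick_dim != 0:
        raise ValueError("volume dimension must be a multiple of the brick size")
    return volume_dim // brick_dim


def brick_of_voxel(x, y, z, brick_dim):
    # Brick coordinates of the brick that contains voxel (x, y, z).
    return (x // brick_dim, y // brick_dim, z // brick_dim)


per_axis = bricks_per_axis(128, 16)   # 8, as in the example above
total_bricks = per_axis ** 3          # 512 bricks at the finest level
```

A tree over 512 bricks is far smaller than a tree over the roughly two million individual voxels of the same volume, which is the overhead argument made above.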
5.2 Separation of Structure and Data
The use of bricks keeps the tree compact enough to store the whole structure in GPU
memory during rendering, since the individual voxels are not referenced. The tree
structure is separated from the raw voxel data. The tree only keeps track of the brick
numbers that correspond to the bricks saved on disk. The nodes in the structure store
the brick number and the number of the node’s child along with error metrics. This
approach is similar to the one described by Crassin et al. (7).
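
A compact node record of this kind could look like the sketch below. The field names, the 16-byte little-endian layout and the use of -1 for "no child" are assumptions made for illustration; the actual node format of the implementation is not given in this section.

```python
import struct
from dataclasses import dataclass

# Hypothetical layout: two 32-bit integers followed by two 32-bit floats.
NODE_FORMAT = "<iiff"
NODE_SIZE = struct.calcsize(NODE_FORMAT)  # 16 bytes per node


@dataclass
class TspNode:
    brick_number: int      # index of the brick in the data file
    child_index: int       # index of the node's child, -1 if none (assumed)
    spatial_error: float
    temporal_error: float

    def pack(self):
        # Serialize the node without touching any voxel data.
        return struct.pack(NODE_FORMAT, self.brick_number, self.child_index,
                           self.spatial_error, self.temporal_error)

    @classmethod
    def unpack(cls, data):
        return cls(*struct.unpack(NODE_FORMAT, data))


node = TspNode(brick_number=42, child_index=7, spatial_error=0.25, temporal_error=0.5)
restored = TspNode.unpack(node.pack())
```

With such a layout, a tree of about a thousand nodes occupies well under 20 kB, whereas the voxel data it refers to stays on disk until bricks are requested.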
5.3 Memory Layout
The TSP tree was originally described as an “octree of binary search trees”, meaning
that each node of an octree skeleton contains a binary search tree, whose nodes in
turn contain the data. The implementation in this thesis uses that structure (described
in chapter 3) during traversal, but the brick numbering and data ordering that get
saved to disk are slightly different. To enable efficient sequential loading of bricks
during rendering (chapter 8), the data is instead ordered so that the nodes of the
octrees are saved next to each other. This leads to a different pointer structure, and a
structure that can be viewed as a “binary search tree of octrees”. The brick structure
is illustrated in figures 5.1 and 5.2.
Figure 5.1: Conceptual tree layout, differing from the structure originally described. The
layout can be thought of as a binary search tree of octrees. The figure uses a quadtree for
illustration.
The main benefit of this approach is that bricks on the same BST and octree
levels have consecutive brick numbers. As will be discussed in chapter 10, the
rotating nature of a Coronal Mass Ejection (CME) event simulation makes spatial
filtering more useful than temporal filtering in the current implementation. This means
that during runs, bricks will often be chosen from the same node in the BST.
Additionally, it is probable that bricks close to each other in the octree will be used
simultaneously. By storing the bricks of one temporal filtering step together, the
two-stage rendering scheme (see chapter 8) can first identify a sequence of bricks, and the
uploading stage can then use a single read operation to read these from disk rather than
fetching them from different parts of the disk.
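
The benefit of consecutive brick numbers can be sketched by grouping requested brick indices into runs, where each run maps to one contiguous read. The function names are hypothetical, and the sketch assumes that bricks of equal size are stored back to back after an optional header of `data_offset` bytes; the real file format may differ.

```python
def coalesce_brick_runs(brick_indices):
    """Group brick indices into (first_index, count) runs of consecutive bricks."""
    runs = []
    for index in sorted(set(brick_indices)):
        if runs and index == runs[-1][0] + runs[-1][1]:
            # The brick directly follows the current run, so extend it.
            runs[-1] = (runs[-1][0], runs[-1][1] + 1)
        else:
            runs.append((index, 1))
    return runs


def run_byte_ranges(runs, brick_bytes, data_offset=0):
    # One (file offset, length) pair per run, i.e. one read operation each.
    return [(data_offset + first * brick_bytes, count * brick_bytes)
            for first, count in runs]


requested = [5, 3, 4, 9, 10, 20]
runs = coalesce_brick_runs(requested)         # [(3, 3), (9, 2), (20, 1)]
reads = run_byte_ranges(runs, brick_bytes=4)  # three reads instead of six
```

The more often the traversal selects neighbouring brick numbers, the fewer and longer the runs become, which is what the chosen ordering is meant to encourage.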
Figure 5.2: Memory layout of the TSP tree, using quadtrees instead of octrees. Numbers
correspond to brick indices.
5.4 Error Metrics
The original TSP tree paper by Shen et al. (5) uses the coefficient of variation to
indicate the error for a brick. This coefficient is defined as the standard deviation σ
over the average µ. Shen et al. implement this by first calculating the average voxel
value for each brick, then calculating the standard deviation in the same brick, and finally
producing the ratio of the two.
This approach seems to have several problems. It is desirable to get a higher error
the further we get from the original data (the leaves) in an octree, but the standard
deviation gets lower and lower the more averaged (filtered) the higher-up bricks become.
Additionally, dividing by the mean produces very large and varying values when
the average gets close to zero. In large and empty areas of the volume the error should
be very low, but dividing by small values is very unstable. For these reasons, the error
metric calculation has been slightly modified.
The spatial error is calculated by first calculating the average $\bar{v}_{\mathrm{brick}}$ for each brick,
where n is the number of voxels in a brick (equation 5.1).
$$\bar{v}_{\mathrm{brick}} = \frac{1}{n} \sum_{i=0}^{n-1} v_i \tag{5.1}$$
Instead of the standard deviation, a modified version is calculated. This is done by
comparing the brick average with the voxel values in the leaf bricks that are covered by
this particular brick. The equation for the modified spatial error (5.2) is similar to a
regular standard deviation calculation. m stands for the number of covered leaf bricks and n
stands for the number of voxels per brick.
$$e_{\mathrm{spatial}} = \sqrt{\frac{1}{m \cdot n} \sum_{j=0}^{m-1} \sum_{i=0}^{n-1} \left(v_{i,j} - \bar{v}_{\mathrm{brick}}\right)^2} \tag{5.2}$$
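The modified spatial error of equation 5.2 can be sketched in a few lines. This is a minimal Python illustration and not the thesis implementation (which is referenced in section A.6); the function names and the representation of a brick as a flat list of floats are assumptions made for the example.

```python
import math

def brick_average(brick):
    """Equation 5.1: mean voxel value of one brick (a flat list of floats)."""
    return sum(brick) / len(brick)

def spatial_error(brick, covered_leaves):
    """Equation 5.2: RMS deviation of the covered leaf voxels from the brick average.

    covered_leaves is a list of m leaf bricks, each a flat list of n voxels.
    """
    v_brick = brick_average(brick)
    m = len(covered_leaves)
    n = len(covered_leaves[0])
    squared = sum((v - v_brick) ** 2 for leaf in covered_leaves for v in leaf)
    return math.sqrt(squared / (m * n))

# A filtered brick that covers two leaves, one empty and one not.
leaves = [[0.0, 0.0, 0.0, 0.0], [2.0, 2.0, 2.0, 2.0]]
parent = [1.0, 1.0, 1.0, 1.0]  # average of the two leaves
print(spatial_error(parent, leaves))           # error of the filtered brick
print(spatial_error(leaves[0], [leaves[0]]))   # an empty leaf compared with itself
```

Because nothing is divided by the brick average, an empty region yields an error of zero instead of an unstable ratio, and a heavily filtered brick is still compared against the original leaf data.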
The temporal error metric is calculated by first calculating the average value over
time for each voxel (at positions i). See equation 5.3, where l is the number of time
steps.
$$\bar{v}_i = \frac{1}{l} \sum_{t=0}^{l-1} v_{i,t} \tag{5.3}$$
These values are then used to calculate the average modified standard deviation
per voxel. Subsequently the voxel standard deviations are averaged per brick. This
average is the temporal error metric, illustrated in equation 5.4, where n is the number
of voxels per brick and m is the number of covered leaf bricks. Implementation details
can be found in section 8.2.
$$e_{\mathrm{temporal}} = \frac{1}{n} \sum_{i=0}^{n-1} \sqrt{\frac{1}{m} \sum_{j=0}^{m-1} \left(\bar{v}_i - v_{i,j}\right)^2} \tag{5.4}$$
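A small sketch may help to read equations 5.3 and 5.4 together. It is written in Python for illustration only; the function name is hypothetical, and it assumes that the per-voxel time average is taken over the same leaf bricks (time steps) that the brick covers.

```python
import math

def temporal_error(covered_leaves):
    """Average per-voxel deviation over time for one brick.

    covered_leaves: m leaf bricks at the same spatial position but at
    different time steps, each a flat list of n voxel values.
    """
    m = len(covered_leaves)
    n = len(covered_leaves[0])
    total = 0.0
    for i in range(n):
        # Equation 5.3 (assumed to run over the covered time steps only).
        v_mean = sum(leaf[i] for leaf in covered_leaves) / m
        # Inner part of equation 5.4: deviation of voxel i over time.
        variance = sum((v_mean - leaf[i]) ** 2 for leaf in covered_leaves) / m
        total += math.sqrt(variance)
    # Outer part of equation 5.4: average over the voxels of the brick.
    return total / n

static = [[1.0, 1.0], [1.0, 1.0]]    # nothing changes between the time steps
changing = [[0.0, 2.0], [2.0, 2.0]]  # the first voxel changes, the second does not
print(temporal_error(static), temporal_error(changing))
```

A brick whose voxels do not change over the covered time span gets a temporal error of zero and can be replaced by its time average without loss.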
5.5 Pointer Structure
To save space, each tree node stores only one child pointer. The pointer can have
different meanings to enable traversal of both the octrees and the overall binary search
tree. If the binary search tree is the root, the child pointer is used to access the
octree. Otherwise, the child pointer points to the BST child node. The pointer usage
is implemented in the traversal scheme.
5.6 Traversal
The rendering algorithm (see section 8) uses two separate TSP tree traversal passes.
Both passes traverse the tree structure in the same way. For traversing the TSP tree,
the high-level approach suggested by Shen et al. (5) is used. A flowchart for the overall
TSP tree traversal can be found in figure 5.3.
Figure 5.3: Flowchart of the TSP tree traversal algorithm. OT - Octree, BST - Binary
search tree.
For the internal octree traversal, a version of the KD-restart algorithm by
Horn et al. (8), adapted for octrees, is used. This algorithm is stackless, which is very
useful when traversing a structure on a GPU with limited memory and stack depth
capabilities.
The hybrid tree traversal implementation can be found in the code samples, section
A.1.
6
Data Formatting
The first pipeline stage, Furnace, extracts volumetric data from various data sources
and produces a format that the subsequent stages use. This format is designed to be
very general and its main purpose is to act as an abstraction layer, leaving optimizations
to procedures later in the pipeline.
6.1 Space Weather Data Sources
While the Open Space software will be capable of handling a large variety of data
sources, space weather related data sources are used throughout the thesis work. This
benefits CCMC, and is a natural continuation of the first stage of the project.
ENLIL
The main data source for this project is the ENLIL model by Xie et al. (9). The model
describes the heliosphere in terms of plasma mass, momentum and energy, among other
variables. ENLIL is used to describe Coronal Mass Ejection (CME) events.
Simulations using this model can be accessed from the CCMC web site (10), and
this project has mainly used a run titled Hong Xie 120312 SH 1 during development.
CDF Data Format
The CCMC space weather event data from simulations is stored in the standardized
CDF (Common Data Format) file format. CDF files store the large number of variables
that the simulation runs generate, as well as additional metadata.
Kameleon
The tool Kameleon (11), developed and maintained at CCMC, is made to extract data
from the CDF files. The software acts as an abstraction layer between the model
data and applications. Kameleon provides access as well as interpolation, allowing
applications to extract spatial and temporal data at arbitrary points.
6.2 Furnace
The very first step in the pipeline is formatting the data. Furnace takes care of this task
using custom modules for the different data sources. These sources include ENLIL data
in the CDF format, and the module that handles ENLIL uses Kameleon to extract the
chosen data. Furnace is configured using a few basic parameters: the location of the
input and output, the type of data source and the desired dimensions of the output volume.
Figure 6.1: Schematic overview of the data formatting stage Furnace
6.3 Voxel Data Format
The output from Furnace is called Voxel Data Format (VDF). The volume data is
represented by floats and is ordered by time steps. The voxels in each frame are ordered
by indices, given by equation 6.1, where xDim, yDim and zDim are the number of voxels
along each axis in the volume.
$$i_{x,y,z} = x + y \cdot \mathit{xDim} + z \cdot \mathit{xDim} \cdot \mathit{yDim} \tag{6.1}$$
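The voxel ordering can be illustrated with a short sketch. It assumes the usual linearization in which x varies fastest, then y, then z, so that one step in z skips a full xy-slice of xDim · yDim voxels. The function names are hypothetical and chosen for this example.

```python
def voxel_index(x, y, z, x_dim, y_dim):
    """Linear index of voxel (x, y, z); x varies fastest, then y, then z."""
    return x + y * x_dim + z * x_dim * y_dim

def voxel_coords(index, x_dim, y_dim):
    """Inverse mapping: recover (x, y, z) from a linear index."""
    z, rest = divmod(index, x_dim * y_dim)
    y, x = divmod(rest, x_dim)
    return x, y, z

# Every voxel of a 4 x 3 x 2 volume gets a unique index in 0..23.
x_dim, y_dim, z_dim = 4, 3, 2
indices = [voxel_index(x, y, z, x_dim, y_dim)
           for z in range(z_dim) for y in range(y_dim) for x in range(x_dim)]
print(indices == list(range(x_dim * y_dim * z_dim)))
```

Iterating z in the outer loop and x in the inner loop visits the voxels in exactly the order in which they are stored, which is the access pattern to prefer when streaming a frame from disk.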
The data is stored in a binary file along with some header data. The header data
describes the type of coordinates (currently Cartesian or spherical), the dimensions and
the number of time steps. The VDF file format is described in table 6.1.
Data field              Representation
Grid type               unsigned integer
Number of time steps    unsigned integer
x dimension             unsigned integer
y dimension             unsigned integer
z dimension             unsigned integer
Voxel data              float
Table 6.1: VDF data format
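As an illustration of the layout in table 6.1, the sketch below writes and reads such a file with Python's `struct` module. Several details are assumptions that the text does not specify: 32-bit little-endian unsigned integers, 32-bit floats, the numeric encoding of the grid type, and the function names themselves.

```python
import io
import struct

HEADER_FORMAT = "<5I"  # assumption: five little-endian 32-bit unsigned integers

def write_vdf(stream, grid_type, num_timesteps, dims, voxels):
    """Write the header fields of table 6.1 followed by the voxel data."""
    x_dim, y_dim, z_dim = dims
    if len(voxels) != num_timesteps * x_dim * y_dim * z_dim:
        raise ValueError("voxel count does not match header")
    stream.write(struct.pack(HEADER_FORMAT, grid_type, num_timesteps,
                             x_dim, y_dim, z_dim))
    stream.write(struct.pack("<%df" % len(voxels), *voxels))

def read_vdf(stream):
    """Read a file written by write_vdf and return its fields."""
    header = stream.read(struct.calcsize(HEADER_FORMAT))
    grid_type, num_timesteps, x_dim, y_dim, z_dim = struct.unpack(HEADER_FORMAT, header)
    count = num_timesteps * x_dim * y_dim * z_dim
    voxels = list(struct.unpack("<%df" % count, stream.read(4 * count)))
    return grid_type, num_timesteps, (x_dim, y_dim, z_dim), voxels

# Round trip of a tiny 2 x 2 x 1 volume with two time steps.
buffer = io.BytesIO()
write_vdf(buffer, 0, 2, (2, 2, 1), [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5])
buffer.seek(0)
print(read_vdf(buffer))
```

Keeping the header this small means that a reader can compute the byte offset of any time step directly from the dimensions, without scanning the file.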
7
Data Pre-Processing
The data needs to be re-formatted from the straight-forward structure into a TSP tree.
The second pipeline stage, Forge, takes care of this process and outputs files which the
rendering stage can use directly.
7.1 Forge
The input is a file in the VDF format described in chapter 6. Forge can be customized to
output TSP trees with different brick sizes. The desired brick size is the only parameter
to Forge, apart from input and output file names.
Figure 7.1: Overview of Forge
7.2 TSP Tree Construction
The TSP tree construction process consists of several stages. The stages use temporary
files as storage where possible to avoid problems when building trees which are too large
to keep in memory at once.
Brick Padding
Bricks will eventually be stored in a 3D texture before getting rendered (see section
8.3 for details). This texture, the brick atlas, does not keep the spatial information
intact. The bricks may be uploaded in any order, and therefore a standard 3D texture
interpolation will fail when sampling close to brick borders. To handle this, each brick
gets padded with a layer of voxels from its spatial neighbors before getting put in the
tree structure. The padding is carried out in two steps, illustrated in figure 7.2:
1. Add an extra layer of voxels around the whole volume, copying the closest border
voxel. This provides neighbors for the bricks on the border.
2. Treat each brick in isolation, and add an extra layer of voxels around each of the
bricks. The added voxels are copies of the neighboring voxels in the volume. Note
that this step is only done for the original voxels, not the extra layer we added in
the previous step.
Figure 7.2: Example of (2D) padding. Original volume is 4 × 4 with 2 × 2 bricks. The
resulting volume (right) is 8 × 8 with 4 × 4 padded bricks, getting the padding from the
neighboring pixels or the added outer border.
The sampling scheme makes sure that samples are always taken inside or on the
border of the original bricks. This ensures that correct interpolation can be done, since
the neighboring voxels will be from the original data.
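The padding scheme can be sketched in 2D, matching the example of a 4 × 4 volume with 2 × 2 bricks. The code below is an illustrative Python sketch, not the code from section A.2. It folds the two steps into one: clamping a coordinate to the volume bounds has the same effect as first adding an outer layer that copies the closest border voxel. The function name and data layout are assumptions.

```python
def pad_bricks_2d(volume, brick_size):
    """Split a 2D volume (list of rows) into bricks with one layer of padding.

    Returns a dict that maps brick coordinates (bx, by) to a padded brick of
    (brick_size + 2) x (brick_size + 2) values.
    """
    height = len(volume)
    width = len(volume[0])

    def sample(x, y):
        # Clamping replicates the closest border voxel outside the volume.
        x = min(max(x, 0), width - 1)
        y = min(max(y, 0), height - 1)
        return volume[y][x]

    bricks = {}
    for by in range(height // brick_size):
        for bx in range(width // brick_size):
            x0 = bx * brick_size - 1  # start one voxel outside the brick
            y0 = by * brick_size - 1
            bricks[(bx, by)] = [
                [sample(x0 + px, y0 + py) for px in range(brick_size + 2)]
                for py in range(brick_size + 2)
            ]
    return bricks

volume = [[0, 1, 2, 3],
          [4, 5, 6, 7],
          [8, 9, 10, 11],
          [12, 13, 14, 15]]
padded = pad_bricks_2d(volume, 2)
for row in padded[(0, 0)]:
    print(row)
```

Each of the four bricks grows from 2 × 2 to 4 × 4 values, so the padded data is 8 × 8 in total, and every padded brick can be interpolated without looking at any other brick.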
The downside of the approach is the added number of voxels. Table 7.1 lists
some typical brick sizes, and compares the unpadded and the padded voxel
counts. As can be seen, padding with an extra layer of voxels results in a significant
voxel increase for a 256×256×256 volume with 256 timesteps. For the 8×8×8 brick
case, padding almost doubles the volume size. For larger brick sizes, the
added overhead is smaller. Sample code for the brick padding can be found in section
A.2.
Brick size      Unpadded voxel count    Volume size with padding    Padded voxel count
8 × 8 × 8       9,797,856,768           320 × 320 × 320             19,136,439,000
16 × 16 × 16    9,797,595,136           288 × 288 × 288             13,950,091,512
32 × 32 × 32    9,795,502,080           272 × 272 × 272             11,749,341,240
Table 7.1: Comparison of unpadded and padded voxel counts for a 256×256×256 volume
with 256 timesteps
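The voxel counts in table 7.1 are consistent with counting every brick in the complete TSP tree, that is, the number of nodes in a full octree multiplied by the number of nodes in a full binary tree over the time steps. The following Python sketch makes that arithmetic explicit. The counting model and the function name are assumptions inferred from the numbers in the table, not something stated in the text.

```python
def tsp_voxel_counts(volume_dim, brick_dim, num_timesteps, padding=1):
    """Return (unpadded, padded) voxel counts for a complete TSP tree."""
    bricks_per_axis = volume_dim // brick_dim
    # Nodes in a full octree: bricks_per_axis**3 leaves, then 1/8 per level up.
    octree_nodes = 0
    level = bricks_per_axis
    while level >= 1:
        octree_nodes += level ** 3
        level //= 2
    # Nodes in a full binary tree with one leaf per time step.
    bst_nodes = 2 * num_timesteps - 1
    total_bricks = octree_nodes * bst_nodes
    unpadded = total_bricks * brick_dim ** 3
    padded = total_bricks * (brick_dim + 2 * padding) ** 3
    return unpadded, padded

for brick_dim in (8, 16, 32):
    unpadded, padded = tsp_voxel_counts(256, brick_dim, 256)
    print(brick_dim, unpadded, padded, round(padded / unpadded, 2))
```

The ratio between padded and unpadded counts is ((b + 2) / b)³ for a brick side of b voxels, which is why the overhead falls quickly as the bricks grow.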
Octree Construction
The first step in building the TSP tree is building one full octree from each time step
in the input data. The octree construction is done by first rearranging the data into
bricks of the chosen size and padding them (see previous section). These bricks are
then given a new index using Z ordering (12). The Z-order (or Morton order) number
arranges the bricks so that the nodes that will make up the children of a higher level
are ordered next to each other. The octree is then built from the bottom up, averaging
the bricks in groups of eight to build the parent nodes of the higher levels. The octree
construction process is illustrated in figure 7.3.
Each octree is saved to a separate file on disk, avoiding limitations on memory.
Sample code for the octree construction can be found in section A.3.
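The role of the Z-order numbering can be shown with a compact sketch. It is a simplified Python illustration, not the code from section A.3: each brick is reduced to a single value, whereas the real bricks hold voxel data, and the choice of x as the lowest interleaved bit is an assumption.

```python
def morton_index(x, y, z, bits):
    """Interleave the bits of brick coordinates (x, y, z) into a Z-order index."""
    index = 0
    for bit in range(bits):
        index |= ((x >> bit) & 1) << (3 * bit)
        index |= ((y >> bit) & 1) << (3 * bit + 1)
        index |= ((z >> bit) & 1) << (3 * bit + 2)
    return index

def build_octree_levels(leaf_values):
    """Build octree levels bottom-up from Z-ordered leaves.

    leaf_values holds one value per leaf brick; its length must be a power
    of eight. Returns a list of levels, from the leaves up to the root.
    """
    levels = [leaf_values]
    while len(levels[-1]) > 1:
        below = levels[-1]
        # In Z-order the eight children of a parent are always consecutive.
        levels.append([sum(below[i:i + 8]) / 8.0 for i in range(0, len(below), 8)])
    return levels

# 4 x 4 x 4 leaf bricks, each represented by its Z-order index as a value.
leaves = [0.0] * 64
for z in range(4):
    for y in range(4):
        for x in range(4):
            leaves[morton_index(x, y, z, 2)] = float(morton_index(x, y, z, 2))
levels = build_octree_levels(leaves)
print([len(level) for level in levels])
```

Since siblings are contiguous, building a parent only requires reading one consecutive run of eight bricks, which suits the file-based construction described above.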
Figure 7.3: Octrees are built by first giving the bricks a new number, and then averaging
nodes to build higher levels until the root is constructed.
BST Assembling
When the octrees are built, they get assembled into the “binary search tree of octrees”
described in chapter 5. This process is relatively simple. First, the leaf level of the BST
(corresponding to individual time steps) is constructed by using the individual octrees
as leaves. Then, the higher levels are built by averaging the two octrees below, so that
each higher level represents a time span of twice the length of the spans on the lower
level. This process is repeated until the root BST node has been constructed.
Sample code for the BST assembling can be found in section A.4.
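The assembling step can be summarized by the following sketch. It is a simplified Python illustration rather than the code from section A.4: an octree is represented as a flat list of values, the number of time steps is assumed to be a power of two, and the function names are hypothetical.

```python
def average_octrees(first, second):
    """Average two octrees node by node (both stored as flat lists of values)."""
    return [(a + b) / 2.0 for a, b in zip(first, second)]

def build_bst_levels(octrees):
    """Build the BST levels from the per-time-step octrees up to the root.

    octrees holds one octree per time step. Each higher level halves the
    number of nodes and doubles the time span that a node represents.
    """
    levels = [octrees]
    while len(levels[-1]) > 1:
        below = levels[-1]
        levels.append([average_octrees(below[i], below[i + 1])
                       for i in range(0, len(below), 2)])
    return levels

# Four time steps, each with a tiny two-node "octree".
steps = [[0.0, 0.0], [2.0, 4.0], [4.0, 4.0], [6.0, 8.0]]
levels = build_bst_levels(steps)
print(levels[1])  # each node covers two time steps
print(levels[2])  # the root covers all four time steps
```

For T time steps the result has 2T − 1 BST nodes, each of which is a complete octree.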
7.3 TSP Data Format
The output from Forge, TSP files, adds a few header fields to the format inherited from
Furnace. The additional values concern brick dimensions. The TSP format is described
in table 7.2.
Data field                   Representation
Grid type                    unsigned integer
Number of time steps         unsigned integer
x dimension                  unsigned integer
y dimension                  unsigned integer
z dimension                  unsigned integer
x brick dimension            unsigned integer
y brick dimension            unsigned integer
z brick dimension            unsigned integer
Number of bricks along x     unsigned integer
Number of bricks along y     unsigned integer
Number of bricks along z     unsigned integer
Voxel data                   float
Table 7.2: TSP data format
8
Rendering
The final piece of the pipeline is the rendering. The software producing the final
renderings is called Flare, and represents the largest of the three stages.
8.1 Flare
Flare renders TSP files from the pre-processing step. The renderings are customized by
choosing a number of parameters and a transfer function, both topics described later
in this chapter.
Figure 8.1: Overview of Flare.
8.2 TSP Structure and Error Metrics Construction
The overall details of the TSP tree structure and the error metrics can be found in
chapter 5. This section further describes some of the implementation techniques and
refers to source code samples.
TSP Structure Construction
The TSP structure used for traversal is kept in memory, both on the host and the GPU.
It is not explicitly stored on disk, so it needs to be constructed before the rendering
loop can be initiated. The construction is relatively quick and traverses the whole
brick structure on disk once, keeping track of child indices and allocating space for
error metrics (to be calculated in the next steps). Code for the construction function
can be found in section A.5.
Error Metrics Calculation
The spatial error calculation runs in two passes. The first pass calculates the average
color for each brick, and the second pass compares the brick average to the leaves that
the current brick covers.
The temporal error calculation also uses several passes. The first pass is run to keep
track of each voxel’s average value over time. Then the modified standard deviation is
calculated per voxel and then averaged over bricks.
The error calculation can be omitted. If the user wants to render without any error
(that is, with zero tolerances), the calculation step is skipped and traversal will always reach the leaves.
Equations for error calculation can be found in section 5.4, and code samples in
section A.6 of the appendix.
Error Caching
Since the current implementation of the error metrics only depends on the voxel data,
the calculated errors for a TSP file can be reused. Flare saves the error metrics to a file
which is read before subsequent renderings. The file is small and reading it is fast. This
simple caching approach enables pre-calculation of the error metrics. The calculation is kept separate
from Forge since the user may want to use different kinds of error metrics, in particular
color-based error metrics that depend on the current transfer function rather than the
raw intensity values (6). Such error metrics need to be re-calculated at every change
of transfer function, and the mechanism therefore belongs in Flare.
8.3 Intra-Frame Pipeline
The rendering algorithm is two-staged. The data needs to be fetched from disk and
uploaded to GPU memory, and a TSP tree probing step is responsible for requesting
the right bricks to upload. When the bricks are uploaded, the ray-casting step renders
the images. This section describes the intra-frame steps taken in detail.
View Ray Generation
When the model, view and projection matrices are updated, it is time to calculate the
direction of the view rays. The algorithm used for this was proposed by Krüger and
Westermann (13) and relies on rendering a colored cube. The volume to be rendered is
bounded by a cube with its opposing corners in (0, 0, 0) and (1, 1, 1), respectively (see
figure 8.2). The cube is colored by letting each corner vertex also represent a color.
The aforementioned vertices therefore represent a black and a white corner.
Figure 8.2: Bounding cube vertices
A simple GLSL shader interpolates the corner colors across the faces of the cube,
resulting in a fully colored cube where the color at a point on the surface also represents
the point’s position in space. By rendering this colored bounding cube twice, once
with back face culling and once with front face culling (figure 8.3), the view rays
can be calculated. Given coordinates on the view plane, the direction of a view ray is
calculated by taking the difference between the color of the front facing point and
the color of the back facing point (figure 8.4).
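The per-pixel calculation is small enough to write out. The sketch below uses Python for illustration, while the actual work is done on the GPU. It assumes that the ray runs from the front-face position to the back-face position, and the function name is hypothetical.

```python
import math

def view_ray(front_color, back_color):
    """Derive a view ray from the two colored-cube renderings of one pixel.

    front_color and back_color are RGB triples in [0, 1] that encode the entry
    and exit positions on the unit cube. Returns (origin, direction, length),
    or None if the pixel does not cover the volume.
    """
    difference = [b - f for f, b in zip(front_color, back_color)]
    length = math.sqrt(sum(d * d for d in difference))
    if length == 0.0:
        return None  # entry and exit coincide: nothing to traverse
    direction = [d / length for d in difference]
    return list(front_color), direction, length

# A ray that enters at the x = 0 face and exits at the x = 1 face.
print(view_ray((0.0, 0.5, 0.5), (1.0, 0.5, 0.5)))
```

The returned length is the distance the ray travels inside the cube, which bounds the number of samples for a given step size.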
Figure 8.3: Bounding cubes with interpolated colors. Left: Front faces. Right: Back
faces.
Figure 8.4: Example of entry and exit point samples and resulting ray direction.
TSP Tree Probing
The data resides in a file on disk and needs to be uploaded to the GPU memory before
the rendering can take place. To consolidate the uploading of all bricks into one single
request and thereby minimize transfer overhead, a probing step is used. The TSP tree
probing is a dry run of the rendering, where the result is a brick request list rather
than a rendered image. This probing uses the same view rays and the same traversal
algorithm as the subsequent rendering. Rays are shot through the volume, and every
time a brick with acceptable error metrics (or a leaf) is found, the responsible OpenCL
kernel increases a value in an array where the indices correspond to brick indices. After
the probing, the bricks that will be needed have a count that is higher than zero. This
process is illustrated in figure 8.5.
Figure 8.5: Brick counts before and after a tree probing step. Initially, all counts are
set to zero. After the probing step, bricks which will be used during rendering will have a
count higher than zero. The example uses only two view rays for simplicity and does not
show the tree traversal process.
Brick Uploading
The brick request list generated by the probing step is scanned, and every brick that
has a count higher than zero is put into a brick list. While this brick list is built, every
added brick also gets a coordinate in the brick atlas, the 3D texture that will hold the
uploaded bricks. This coordinate is saved in the brick list and thereby maps every
brick to a unique atlas lookup position. Note that the atlas coordinates do not have
any spatial meaning; they are only a way to keep track of where the rendering kernel will
fetch the data.
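The scan of the request list can be sketched as follows. This is an illustrative Python sketch, not the Flare implementation: the atlas is assumed to be a cube that holds the same number of bricks along each axis, slots are handed out in scan order, and the function name is hypothetical.

```python
def build_brick_list(request_counts, atlas_bricks_per_axis):
    """Turn probing counts into a list of (brick index, atlas coordinate) pairs.

    request_counts[i] is the number of times brick i was hit during probing.
    Atlas coordinates are assigned in scan order and carry no spatial meaning.
    """
    capacity = atlas_bricks_per_axis ** 3
    brick_list = []
    for brick_index, count in enumerate(request_counts):
        if count == 0:
            continue  # brick is not needed for this frame
        slot = len(brick_list)
        if slot >= capacity:
            raise ValueError("brick atlas is full")
        ax = slot % atlas_bricks_per_axis
        ay = (slot // atlas_bricks_per_axis) % atlas_bricks_per_axis
        az = slot // (atlas_bricks_per_axis ** 2)
        brick_list.append((brick_index, (ax, ay, az)))
    return brick_list

counts = [0, 3, 0, 1, 7, 0]  # bricks 1, 3 and 4 were requested
print(build_brick_list(counts, 2))
```

Because the scan runs in ascending brick index order, the resulting list is already sorted, which makes it easy to spot consecutive bricks in the subsequent upload step.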
The data upload occurs in two steps. The data from disk is uploaded to an OpenGL
Pixel Buffer Object (PBO) that is mapped to system memory. The PBO corresponds
to the 3D texture that will store the atlas. The uploading is done by scanning the brick
list and placing each brick in the correct spot in the PBO. If the uploading algorithm
detects several consecutive bricks to be uploaded, those are read together to avoid disk
read overhead.
When the PBO is populated with the brick data, an OpenGL 3D texture is built
by copying the pixels from the PBO. The 3D texture is then ready to use by the GPU
rendering kernel. See figure 8.6 for a schematic overview of the brick upload process.
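Reading consecutive bricks together can be illustrated without any OpenGL calls. The following Python sketch groups requested brick indices into runs and issues one read per run. The function names, the fixed brick size in bytes and the `data_offset` parameter (the size of any file header) are assumptions made for the example.

```python
import io

def coalesce_reads(brick_indices):
    """Group brick indices into (first_index, count) runs of consecutive bricks."""
    runs = []
    for index in sorted(set(brick_indices)):
        if runs and runs[-1][0] + runs[-1][1] == index:
            runs[-1] = (runs[-1][0], runs[-1][1] + 1)  # extend the current run
        else:
            runs.append((index, 1))
    return runs

def read_bricks(stream, brick_indices, brick_bytes, data_offset=0):
    """Read the requested bricks with one seek and one read per run."""
    bricks = {}
    for first, count in coalesce_reads(brick_indices):
        stream.seek(data_offset + first * brick_bytes)
        chunk = stream.read(count * brick_bytes)
        for k in range(count):
            bricks[first + k] = chunk[k * brick_bytes:(k + 1) * brick_bytes]
    return bricks

# Ten bricks of four bytes each, stored back to back.
data = io.BytesIO(bytes(range(40)))
print(coalesce_reads([5, 1, 2, 3, 8]))
print(read_bricks(data, [5, 1, 2, 3, 8], 4)[2])
```

With the brick layout described in chapter 5, bricks that are used together tend to have consecutive indices, so a frame typically needs far fewer reads than it has bricks.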
Figure 8.6: Brick uploading. The bricks in the brick list (top) are read from disk, copied
to the right position in the PBO in memory and then uploaded to the 3D texture on the
GPU.
Ray-Casting
The rendering kernel can be launched as soon as the 3D texture is ready. The rendering
process traverses the TSP tree in the same way as during probing (see section 3.2) and
with the same view parameters, thus visiting the same bricks. Each sample position
(converted to spherical coordinates if needed) is translated to the corresponding texture
atlas coordinates, and the brick in that atlas cell is sampled. The samples are composited
in the manner described in section 3.1 to render the final color for the sampled view
plane coordinate.
8.4 Asynchronous Execution
The rendering of each individual frame must follow the logical order presented above,
but the pipeline’s bandwidth can be used in a more efficient way by interleaving the
rendering steps during two rendering iterations. Since different tasks in the pipeline
are handled by different parts of the system, parallel execution is important for perfor-
mance. In a simple model, some tasks are handled by the CPU and others by the GPU.
Figure 8.7 describes the order in which tasks are processed. Note that the figure does
not show the correct relation between execution times. See section 9.2 for measured
times, and section A.7 in the appendix for the complete rendering loop code.
Figure 8.7: Simplified overview of the interleaving of rendering steps. Each color repre-
sents a different time step. The figure shows two rendering iterations, where the frame at
t=2 (in green) gets completely processed at the same time as the neighboring frames get
finalized or initiated.
8.5 Rendering Parameters
The rendering application can be configured using a few different parameters. Below
is a list of these parameters and their meaning.
Local OpenCL work size (x and y) Changes the local work size for the OpenCL
probing and ray-casting kernels. Can be used to tune performance. See NVIDIA’s
OpenCL Best Practices Guide (14) for performance heuristics.
Texture division factor A higher factor than 1 decreases the output texture size,
thereby lowering the number of calculated view rays. This factor can be used to
easily give up quality to gain speed.
Spatial error tolerance Maximum tolerable spatial error.
Temporal error tolerance Maximum tolerable temporal error.
Error calculation (on/off) If turned off, the error calculation step is skipped and
tolerances are set to zero.
Probing step size Step size in the probing kernel.
Ray-casting step size Step size in the ray-casting kernel.
Ray-casting intensity A factor that the final colors get multiplied with. Used to adjust
image brightness.
8.6 Cluster Rendering
Large dome theater displays often use a cluster of rendering computers and projectors
to be able to render on very large and curved screens. Such a cluster needs to be able
to divide the rendering work between its nodes, in such a way that each node renders
a portion of the scene without visible seams and artifacts.
SGCT
Simple Graphics Cluster Toolkit (SGCT) is developed at Linköping University. It is a
cross-platform C++ library enabling graphics synchronization over a cluster of com-
puters. A rough and basic implementation of SGCT is used as the rendering system
for the work presented in this thesis, enabling dome rendering and stereography as well
as standard desktop rendering.
9
Results
9.1 Hardware
Development, rendering and testing have been carried out on a standard desktop com-
puter, equipped with the following hardware:
• 16 GB system memory
• SATA 2.0 SSD drive
• GPU: GeForce GTX 690, two cores with 2 GB RAM each, PCI Express 3.0
9.2 Rendering Benchmarks
Table 9.1 shows a selection of benchmarks of the different steps taken in the rendering
loop. Each measurement has been determined by averaging a number of runs. The total
rendering loop time has been measured when utilizing the asynchronous execution
of the rendering, while the other times have been measured individually.
A      B      C       D       E       F        G        H       I          J
128    128    0.02    0.01    0.07    0.019    0.028    0.05    0.00005    0.0002
128    128    0.04    0.03    0.07    0.017    0.010    0.05    0.00005    0.0002
128    16     0.02    0.01    0.14    0.023    0.062    0.10    0.010      0.0002
128    16     0.04    0.03    0.12    0.017    0.022    0.10    0.010      0.0002
128    32     0.02    0.01    0.10    0.016    0.044    0.07    0.013      0.0002
128    32     0.04    0.03    0.08    0.020    0.018    0.07    0.013      0.0002
256    32     0.02    0.01    0.50    0.036    0.069    0.42    0.010      0.0002
256    32     0.04    0.03    0.49    0.016    0.027    0.42    0.010      0.0002
256    64     0.02    0.01    0.41    0.023    0.056    0.39    0.013      0.0002
256    64     0.04    0.03    0.41    0.022    0.024    0.39    0.013      0.0002
A: Number of voxels per axis in full volume
B: Number of voxels per axis in bricks
C: Probing step size
D: Ray-caster step size
E: Total rendering loop time in seconds
F: Probing kernel execution time in seconds
G: Ray caster kernel execution time in seconds
H: Disk to PBO upload time in seconds
I: Read brick request list and build brick list time in seconds
J: Other render loop steps (proxy geometry, textures et cetera) in seconds
Table 9.1: Rendering loop benchmarking. Measurements made while rendering 256 time
steps of an ENLIL model.
9.3 Error Metrics Benchmarks
To benchmark the efficiency of the error metrics approach, three levels of error have
been determined. These three levels are labeled no, low and high error. The levels
have been determined using a combination of subjective visual quality and the
approximate share of bricks (out of the total number in the volume) that are used
and/or cached while rendering. Both temporal and spatial errors have been accepted at
the low and high error levels. The low error level corresponds to relatively poor visual
results that are still usable under some conditions. The high setting produces renderings
with very large artifacts and can only be used for benchmarking. The results are shown
in table 9.2.
The no error level uses 100% of the bricks. The low and high levels use approxi-
mately 75% and 40% of the bricks, respectively.
9.4 Visual Results
This section shows samples from interactive renderings of a CME event.
Brick size    Error level    Total render loop time (s)
16            no             0.12
16            low            0.16
16            high           0.07
32            no             0.10
32            low            0.07
32            high           0.05
Table 9.2: Error metrics benchmarking. Measurements made while rendering 256 time
steps of an ENLIL model with 128 voxels per axis.
Desktop Rendering
Figure 9.1 shows three renderings of the same sequence, each using a different transfer
function. The three screenshots from each sequence are from the beginning, middle
and end of the visible CME event.
Dome Rendering
The Hayden Planetarium at the American Museum of Natural History in New York,
USA, houses an immersive fulldome theater. This theater is used for high quality space
productions, both pre-rendered and interactive. AMNH is an important collaborator
in the Open Space project, and a test run of a cluster implementation of the rendering
software was successfully carried out on-site in the planetarium. See figure 9.2 for a
photo of the occasion.
Figure 9.1: Rendering screenshots, each column with a different transfer function.
Figure 9.2: SGCT was used to enable this interactive space weather rendering at the
Hayden Planetarium at the American Museum of Natural History in New York, USA.
10
Discussion and Future Work
10.1 Visual Quality
Rendering
The visual quality of the renderings is adequate given the simple rendering approach.
While the produced images are correct and informative, it would be desirable to increase
the resolution of the volumes further, as volumes of 128 or 256 voxels per axis often
will be too low-resolution for real applications.
10.2 Interactivity
Performance
In an interactive application, performance is obviously crucial. A certain framerate
has to be reached both to give the user a good viewing experience and to keep
interactions responsive. If the framerate drops too low, interactions become unintuitive
and less useful. The performance measurements have shown that the application can
run on a consumer-grade desktop computer and reach good framerates for the used
volumes.
10.3 Pipeline
Encapsulation
The choice to split the software into three individual parts has been one of the major
decisions in the development. While the pipeline has not yet been fully utilized by
trying different data sources, one of the requirements has been to handle a large variety
of data sources. The encapsulation provides a nice funnel with which new kinds of
volumetric data can be added without altering the tree structure or rendering. In the
same way, the rendering or the tree construction can be changed without worrying
about the other stages of the pipeline. In an experimental proof-of-concept application
like the one implemented, this has been very important.
10.4 TSP Tree Structure
Construction
The implementation of the TSP tree construction is relatively straight-forward. The
focus has been to produce correct and robust results rather than making the process
fast. This approach has also been taken while developing the use of several temporary
files on disk during the creation process. The technique effectively erases many concerns
related to memory availability when constructing potentially very large tree structures.
Naturally, the trade-off for this capability is speed. A great increase in efficiency could
be achieved by developing a more dynamic solution where fast system memory is being
used as much as possible, only switching to disk when needed. The algorithm could
also benefit greatly from parallelization, but that also requires not depending on the
slow disk read/write bandwidth for many operations.
Storage
Storing the raw data on disk and the tree structure in memory has proven to be an
efficient solution, used in projects of larger scale. The time spent on constructing the
structure and uploading it to the graphics card is very small compared to the transfer
of data or kernel execution, and the reads from the structure in the kernels are also a
very small and quick part of the algorithm.
The key to this approach is the use of bricks. The brick concept is fundamental
since it provides a way to make the tree structure several orders of magnitude smaller
than the complete structure. It also provides a natural domain in which to filter and
calculate error metrics, as bricks have a spatial meaning. For caching purposes, it
is important that the bricks do not themselves need to know their place in the full
structure. This requires that the tree structure is kept in order at all times but, again,
the overhead of this structure is very small compared to the benefits of being able to
put bricks in arbitrary positions on the texture atlas that gets uploaded to the GPU.
10.5 Data Formats
The data format chosen for the implementation reflects the encapsulation in the pipeline,
being somewhat redundant and requiring careful structuring. As we have seen, the in-
dividual parts of the pipeline can only communicate using files, so it is very important
that the data formats are kept intact to avoid changes in many parts of the software.
Considering this, it would be desirable to further break out and abstract the data for-
mat definition outside the current pipeline and make the read and write operations
more flexible. For example, in a larger-scale implementation, it needs to be easier to
add an extra variable to a header.
10.6 Error Metrics
Calculation
The visual and performance-related results have shown that the concept of error metrics
can be useful. An increased error tolerance yields a shorter rendering time as fewer
bricks need to be uploaded and traversed. Additionally, the way of calculating these
metrics does take spatial and temporal coherence into account. Areas with less
variation get more heavily filtered than areas with more changes, such as
the areas where a CME front develops.
However, there is much to improve in this area. Since the background winds in a
CME simulation are inherently rotating, using filtering in the temporal domain quickly
leads to very visible artifacts. The rotation of the magnetic fields needs to be smooth
for a good visual experience. While this rotation makes temporal filtering hard in the
current software state, there are very large gains to be made with a more sophisticated
implementation. If the movements of the background winds can be predicted, it would
be possible to reuse large portions of the data by merely changing the position accord-
ingly. There is often no need to update this data as it rarely changes intensity, only
position.
Another area that needs to be improved before the error metrics can be truly useful
is the calculation efficiency. As with the TSP tree construction, the implementation is
currently very straightforward and unoptimized. Traversing the tree structure several
times to calculate averages over both the spatial and temporal domains leads to an
unacceptable time complexity. Ellsworth et al. (6) show alternative implementations
of error calculations. However, that approach relies on errors based on color, which has
proven to be troublesome (see chapter 5).
As discussed in section 5.4, the original implementation has been altered. Working
with large areas with intensities close to zero leads to numerical problems. The same
type of problem arose when exploring the color-based approach. Using color as a
reference could be beneficial for visual results, but using color as ordinal values has its
drawbacks. Colors with small intensities (large, black areas) again lead to numerical
problems and inconsistencies. Ideally, the error metrics calculation should be able to
calculate coherency uniformly in dark as well as bright areas.
Control
The current error metrics implementation has two major drawbacks. Error tolerances
cannot be adjusted in real time, and measurements are not based on color. This means
that the efficiency might be visually very different for different transfer functions, and
also that it is hard to see the effect of a changed tolerance quickly. For future work, a
further investigation of color-based and real-time error metrics could prove very useful.
Choosing the correct brick size is very important, and the brick approach means
that a key trade-off has to be made. With small bricks, the filtering schemes and error
calculations are more fine-grained. Errors will be calculated for smaller areas and visual
artifacts may be smaller. On the other hand, the tree structure gets larger and the
traversal slower. The overhead from duplicating border bricks in the padding step also
gets larger, but that might not be a problem unless bricks get very small.
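The growth of the tree can be made concrete with a rough node count. The sketch below assumes a cubic volume whose side is a power-of-two multiple of the brick side, a full octree per BST node and a full binary tree over a power-of-two number of time steps, in line with the layout in Appendix A.5. The function names are illustrative and do not appear in the implementation.

```cpp
#include <cassert>
#include <cstdint>

// Nodes in a full octree whose leaf level holds bricksPerAxis^3 bricks.
// bricksPerAxis is assumed to be a power of two.
std::uint64_t OctreeNodes(std::uint64_t bricksPerAxis) {
  assert(bricksPerAxis > 0 && (bricksPerAxis & (bricksPerAxis - 1)) == 0);
  std::uint64_t nodes = 0;
  for (std::uint64_t n = bricksPerAxis; n > 0; n /= 2) {
    nodes += n * n * n;  // one octree level
  }
  return nodes;
}

// Nodes in a full binary tree with numTimesteps leaves (a power of two).
std::uint64_t BstNodes(std::uint64_t numTimesteps) {
  assert(numTimesteps > 0 && (numTimesteps & (numTimesteps - 1)) == 0);
  return 2 * numTimesteps - 1;
}

// Total number of TSP nodes: one octree per BST node.
std::uint64_t TspNodes(std::uint64_t volumeDim, std::uint64_t brickDim,
                       std::uint64_t numTimesteps) {
  assert(brickDim > 0 && volumeDim % brickDim == 0);
  return OctreeNodes(volumeDim / brickDim) * BstNodes(numTimesteps);
}
```

Under these assumptions a 256-voxel cube with 8 time steps gives 8775 nodes for 32-voxel bricks and 70215 nodes for 16-voxel bricks, so halving the brick side multiplies the structure to be traversed by roughly eight.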
10.7 Rendering
As the focus of this project has been put on the pipeline and the preparing stages,
the rendering technique can be improved substantially. While the data request and
rendering loop have been developed to fit the pipeline, the ray-casting rendering itself
is relatively unsophisticated. There are several ways to improve this. For example, a
volume sampling scheme capable of adapting to the detail level of the volume would
save a lot of samples and improve the visual quality.
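As an illustration of what such an adaptive scheme could look like, the sketch below scales the step size with the voxel size of the brick currently in use and applies the standard opacity correction so that the accumulated opacity stays comparable between step sizes. This is an assumption about a possible design, not something implemented in this work; `AdaptiveStepSize` and `CorrectedOpacity` are hypothetical helpers.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: levelsAboveLeaf == 0 means a full-resolution brick.
// Each coarser octree level doubles the voxel size, so the step is doubled too.
float AdaptiveStepSize(float baseStepSize, unsigned int levelsAboveLeaf) {
  assert(baseStepSize > 0.0f);
  return std::ldexp(baseStepSize, static_cast<int>(levelsAboveLeaf));
}

// Opacity correction for a sample taken with a step that is stepRatio times
// the step the transfer function was designed for:
// alpha' = 1 - (1 - alpha)^stepRatio.
float CorrectedOpacity(float alpha, float stepRatio) {
  assert(alpha >= 0.0f && alpha <= 1.0f);
  return 1.0f - std::pow(1.0f - alpha, stepRatio);
}
```

A ray passing through a brick two levels above the leaves would then take a quarter of the samples, with each sample weighted accordingly.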
It is very important to be able to integrate various kinds of data and objects in the
future Open Space project. Planets, spacecraft, text labels and field lines are just some
examples of items that could fit into a scene. This has to be taken into consideration
when further developing the rendering system and choosing the appropriate methods.
As these items, or any other phenomena to be rendered, can have very large variations
in scale, adaptiveness is important not just in a volume data set but for all kinds of
data in a scene.
The dual-loop implementation with one data request pass and the subsequent ren-
dering pass is in principle a simplified version of the advanced approach presented in
GigaVoxels (7). While the focus of GigaVoxels is on static (but large) scenes,
there are elements that could prove useful for future work. For example, the GigaVoxels
Cache Manager using a Least Recently Used updating policy could prove efficient in
combination with a further developed temporal caching approach. The data streaming
system in GigaVoxels also takes visibility into account. That is important for scenes
originating from mesh data, but not as crucial in volumetric scenes where the whole
volume is visible. On the other hand, a rendering scheme that can provide varying
levels of detail is desirable. For example, spending less time on far-away voxels or
voxels that won’t contribute to an already saturated viewing ray would save a lot of
processing power.
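The bookkeeping behind a Least Recently Used policy can be sketched as follows. This is a generic illustration, assuming bricks are identified by an integer index; it is not the GigaVoxels Cache Manager interface, and the class name `BrickLruCache` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <unordered_map>

// Minimal LRU bookkeeping for brick slots (illustrative only).
class BrickLruCache {
 public:
  explicit BrickLruCache(std::size_t capacity) : capacity_(capacity) {
    assert(capacity > 0);
  }

  // Marks brickIndex as used and returns true on a cache hit.
  // On a miss the brick is inserted; if the cache was full, the least
  // recently used brick is evicted and reported through *evicted.
  // *evicted is set to -1 when nothing was evicted.
  bool Touch(int brickIndex, int* evicted = nullptr) {
    if (evicted) *evicted = -1;
    auto found = lookup_.find(brickIndex);
    if (found != lookup_.end()) {
      // Hit: move the entry to the front (most recently used).
      order_.splice(order_.begin(), order_, found->second);
      return true;
    }
    if (order_.size() == capacity_) {
      int victim = order_.back();
      order_.pop_back();
      lookup_.erase(victim);
      if (evicted) *evicted = victim;
    }
    order_.push_front(brickIndex);
    lookup_[brickIndex] = order_.begin();
    return false;
  }

  std::size_t Size() const { return order_.size(); }

 private:
  std::size_t capacity_;
  std::list<int> order_;  // front = most recently used
  std::unordered_map<int, std::list<int>::iterator> lookup_;
};
```

In a temporal caching scheme, a hit would mean that the brick is already resident and needs no upload for the current time step, while the evicted index tells the uploader which slot can be overwritten.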
When using relatively small bricks, the brick padding solution can mean a doubling
of the volume size. As smaller bricks might be desirable for fine-grained error
calculations, one has to be careful when choosing the size of the bricks. The balance
between error metrics control, traversal speed and storage size must be adapted to each
application.
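The storage cost of padding follows directly from the brick dimensions. As a small worked sketch, assuming cubic bricks padded on every side as in Appendix A.2 (the function name is illustrative):

```cpp
#include <cassert>

// Storage factor caused by padding both sides of each axis of a cubic brick:
// ((brickDim + 2 * paddingWidth) / brickDim)^3.
double PaddingOverhead(unsigned int brickDim, unsigned int paddingWidth) {
  assert(brickDim > 0);
  double padded = static_cast<double>(brickDim + 2 * paddingWidth);
  double ratio = padded / static_cast<double>(brickDim);
  return ratio * ratio * ratio;
}
```

With a padding width of one voxel, bricks of 8 voxels per side store (10/8)^3, or about 1.95 times the original data, which is the doubling mentioned above. Bricks of 32 voxels per side store (34/32)^3, or about 1.2 times.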
11
Conclusions
The renderings produced using the implemented system are correct and useful, and
run at interactive rates. The visual quality is good, while allowing for several further
improvements. The system can produce these images on a consumer-grade hardware
configuration as well as in a clustered environment, showing flexibility and adaptability.
As shown before, volumetric rendering is very useful for visualizing space-related
data in 3D.
For a larger-scale system, such as the future Open Space project, some important
areas of improvement can be summarized:
• The error metrics system needs to be more stable, efficient and intuitive. A
color-based, real-time solution would be desirable.
• The TSP tree solution can be useful after optimizing the construction stages and
the GPU traversal scheme.
• While the basic concepts of the brick uploading, caching and traversal work well, a
more mature system needs a dynamic approach in which the system can seamlessly
switch between in-memory scenes and disk uploads. This could reduce overhead for
small scenes or systems with large amounts of memory available.
Equally important, there are also specific approaches that have proven useful:
• Encapsulation at logical places in the pipeline is important for flexibility and
adaptability. This allows certain areas to be improved or changed without affect-
ing the other parts of the system. It is important to decide on these stages early
in development.
• Using bricks for the data and brick pointers for the in-memory traversal enables
on-demand data uploads, something that is absolutely vital in large scenes that
do not fit into memory. The choice of brick size is very important, and balances
many performance aspects.
• Storing and building the tree structures on disk is important for letting the system
scale and handle very large amounts of data.
References
[1] Committee on Solar and Space Physics and Committee on Solar-Terrestrial Research.
Space weather: A research perspective, 1997.
[2] The National Space Weather Program Council. The national space weather
program - the strategic plan, 1995.
[3] M. Törnros. Interactive visualization of space weather data. Master's thesis,
Linköping University, Sweden, 2013.
[4] Jennis Meyer-Spradow, Timo Ropinski, Jörg Mensmann, and Klaus Hinrichs.
Voreen: A rapid-prototyping environment for ray-casting-based volume
visualizations. In IEEE Computer Graphics and Applications, Volume 29, Number 6,
pages 6–13, 2009.
[5] Han-Wei Shen, Ling-Jen Chiang, and Kwan-Liu Ma. A fast volume rendering
algorithm for time-varying fields using a time-space partitioning (TSP) tree. In
Proceedings of the Conference on Visualization '99: Celebrating Ten Years, VIS
'99, pages 371–377, Los Alamitos, CA, USA, 1999. IEEE Computer Society Press.
[6] David Ellsworth, Ling-Jen Chiang, and Han-Wei Shen. Accelerating time-varying
hardware volume rendering using TSP trees and color-based error metrics. In
Proceedings of the 2000 IEEE Symposium on Volume Visualization, VVS '00, pages
119–128, New York, NY, USA, 2000. ACM.
[7] Cyril Crassin, Fabrice Neyret, Sylvain Lefebvre, and Elmar Eisemann. Gigavoxels:
Ray-guided streaming for efficient and detailed voxel rendering. In Proceedings of
the 2009 Symposium on Interactive 3D Graphics and Games, I3D '09, pages 15–22,
New York, NY, USA, 2009. ACM.
[8] Daniel Reiter Horn, Jeremy Sugerman, Mike Houston, and Pat Hanrahan.
Interactive k-d tree GPU raytracing. In Proceedings of the 2007 Symposium on
Interactive 3D Graphics and Games, I3D '07, pages 167–174, New York, NY, USA,
2007. ACM.
[9] Hong Xie, Leon Ofman, and Gareth Lawrence. Cone model for halo CMEs:
Application to space weather forecasting. J. Geophys. Res., 109(A03109), 2004.
[10] Community Coordinated Modeling Center. http://ccmc.gsfc.nasa.gov/.
Accessed: 2014-02-09.
[11] Community Coordinated Modeling Center. Kameleon - conversion, access,
interpolation. http://ccmc.gsfc.nasa.gov/downloads/kameleon.pdf, 2006. Accessed:
2014-02-09.
[12] G. M. Morton. A computer oriented geodetic data base and a new technique in
file sequencing. In IBM Germany Scientific Symposium Series, 1966.
[13] J. Krüger and R. Westermann. Acceleration techniques for GPU-based volume
rendering. In Proceedings of the 14th IEEE Visualization 2003 (VIS'03), VIS '03,
pages 38–, Washington, DC, USA, 2003. IEEE Computer Society.
[14] NVIDIA Corporation. NVIDIA OpenCL best practices guide, 2009.
Appendix A
Code Samples
A.1 TSP Tree Traversal
// OpenCL kernel

// Mirrors struct on host side
struct TraversalConstants {
  int gridType;
  float stepsize;
  int numTimesteps;
  int numValuesPerNode;
  int numOTNodes;
  float temporalTolerance;
  float spatialTolerance;
};

// Return index to left BST child (low timespan)
int LeftBST(int bstNodeIndex, int numValuesPerNode, int numOTNodes,
            bool bstRoot, __global __read_only int *tsp) {
  // If the BST node is a root, the child pointer is used for the OT.
  // The child index is next to the root.
  // If not root, look up in TSP structure.
  if (bstRoot) {
    return bstNodeIndex + numOTNodes;
    // return bstNodeIndex + 1;
  } else {
    return tsp[bstNodeIndex * numValuesPerNode + 1];
  }
}

// Return index to right BST child (high timespan)
int RightBST(int bstNodeIndex, int numValuesPerNode, int numOTNodes,
             bool bstRoot, __global __read_only int *tsp) {
  if (bstRoot) {
    return bstNodeIndex + numOTNodes*2;
  } else {
    return tsp[bstNodeIndex * numValuesPerNode + 1] + numOTNodes;
  }
}

// Return child node index given a BST node, a time span and a timestep
// Updates timespan
int ChildNodeIndex(int bstNodeIndex, int *timespanStart, int *timespanEnd,
                   int timestep, int numValuesPerNode, int numOTNodes,
                   bool bstRoot, __global __read_only int *tsp) {
  // Choose left or right child
  int middle = *timespanStart + (*timespanEnd - *timespanStart)/2;
  if (timestep <= middle) {
    // Left subtree
    *timespanEnd = middle;
    return LeftBST(bstNodeIndex, numValuesPerNode, numOTNodes,
                   bstRoot, tsp);
  } else {
    // Right subtree
    *timespanStart = middle+1;
    return RightBST(bstNodeIndex, numValuesPerNode, numOTNodes,
                    bstRoot, tsp);
  }
}

// Return the brick index that a BST node represents
int BrickIndex(int bstNodeIndex, int numValuesPerNode,
               __global __read_only int *tsp) {
  return tsp[bstNodeIndex * numValuesPerNode + 0];
}

// Checks if a BST node is a leaf or not
bool IsBSTLeaf(int bstNodeIndex, int numValuesPerNode,
               bool bstRoot, __global __read_only int *tsp) {
  if (bstRoot) return false;
  return (tsp[bstNodeIndex * numValuesPerNode + 1] == -1);
}

// Checks if an OT node is a leaf or not
bool IsOctreeLeaf(int otNodeIndex, int numValuesPerNode,
                  __global __read_only int *tsp) {
  // CHILD_INDEX is at offset 1, and -1 represents leaf
  return (tsp[otNodeIndex * numValuesPerNode + 1] == -1);
}

// Return OT child index given current node and child number (0-7)
int OTChildIndex(int otNodeIndex, int numValuesPerNode,
                 int child,
                 __global __read_only int *tsp) {
  int firstChild = tsp[otNodeIndex * numValuesPerNode + 1];
  return firstChild + child;
}

// Temporal error is stored at offset 3, reinterpreted as a float
float TemporalError(int bstNodeIndex, int numValuesPerNode,
                    __global __read_only int *tsp) {
  return as_float(tsp[bstNodeIndex * numValuesPerNode + 3]);
}

// Spatial error is stored at offset 2, reinterpreted as a float
float SpatialError(int bstNodeIndex, int numValuesPerNode,
                   __global __read_only int *tsp) {
  return as_float(tsp[bstNodeIndex * numValuesPerNode + 2]);
}

// Given a point, a box mid value and an offset,
// return enclosing octree child
int EnclosingChild(float3 P, float boxMid, float3 offset) {
  if (P.x < boxMid+offset.x) {
    if (P.y < boxMid+offset.y) {
      if (P.z < boxMid+offset.z) {
        return 0;
      } else {
        return 4;
      }
    } else {
      if (P.z < boxMid+offset.z) {
        return 2;
      } else {
        return 6;
      }
    }
  } else {
    if (P.y < boxMid+offset.y) {
      if (P.z < boxMid+offset.z) {
        return 1;
      } else {
        return 5;
      }
    } else {
      if (P.z < boxMid+offset.z) {
        return 3;
      } else {
        return 7;
      }
    }
  }
}

// Update octree offset
void UpdateOffset(float3 *offset, float boxDim, int child) {
  if (child == 0) {
    // do nothing
  } else if (child == 1) {
    offset->x += boxDim;
  } else if (child == 2) {
    offset->y += boxDim;
  } else if (child == 3) {
    offset->x += boxDim;
    offset->y += boxDim;
  } else if (child == 4) {
    offset->z += boxDim;
  } else if (child == 5) {
    offset->x += boxDim;
    offset->z += boxDim;
  } else if (child == 6) {
    offset->y += boxDim;
    offset->z += boxDim;
  } else if (child == 7) {
    *offset += (float3)(boxDim);
  }
}

// Given an octree node index, traverse the corresponding BST tree and look
// for a useful brick.
bool TraverseBST(int otNodeIndex, int *brickIndex, int timestep,
                 __constant struct TraversalConstants *constants,
                 __global volatile int *reqList,
                 __global __read_only int *tsp) {

  // Start at the root of the current BST
  int bstNodeIndex = otNodeIndex;
  bool bstRoot = true;
  int timespanStart = 0;
  int timespanEnd = constants->numTimesteps;

  // Rely on structure for termination
  while (true) {

    // Update brick index (regardless if we use it or not)
    *brickIndex = BrickIndex(bstNodeIndex,
                             constants->numValuesPerNode,
                             tsp);

    // If temporal error is ok
    if (TemporalError(bstNodeIndex, constants->numValuesPerNode,
                      tsp) <= constants->temporalTolerance) {

      // If the ot node is a leaf, we can't do any better spatially so we
      // return the current brick
      if (IsOctreeLeaf(otNodeIndex, constants->numValuesPerNode, tsp)) {
        return true;

      // All is well!
      } else if (SpatialError(bstNodeIndex, constants->numValuesPerNode,
                              tsp) <= constants->spatialTolerance) {
        return true;

      // If spatial failed and the BST node is a leaf
      // The traversal will continue in the octree (we know that
      // the octree node is not a leaf)
      } else if (IsBSTLeaf(bstNodeIndex, constants->numValuesPerNode,
                           bstRoot, tsp)) {
        return false;

      // Keep traversing BST
      } else {
        bstNodeIndex = ChildNodeIndex(bstNodeIndex, &timespanStart,
                                      &timespanEnd, timestep,
                                      constants->numValuesPerNode,
                                      constants->numOTNodes,
                                      bstRoot, tsp);
      }

    // If temporal error is too big and the node is a leaf
    // Return false to traverse OT
    } else if (IsBSTLeaf(bstNodeIndex, constants->numValuesPerNode,
                         bstRoot, tsp)) {
      return false;

    // If temporal error is too big and we can continue
    } else {
      bstNodeIndex = ChildNodeIndex(bstNodeIndex, &timespanStart,
                                    &timespanEnd, timestep,
                                    constants->numValuesPerNode,
                                    constants->numOTNodes,
                                    bstRoot, tsp);
    }

    bstRoot = false;
  }
}

// Traverse one ray through the volume, build brick list
void TraverseOctree(float3 rayO,
                    float3 rayD,
                    float maxDist,
                    __constant struct TraversalConstants *constants,
                    __global volatile int *reqList,
                    __global __read_only int *tsp,
                    const int timestep) {

  float stepsize = constants->stepsize;
  float3 P = rayO;
  // Keep traversing until the sample point goes outside the unit cube
  float traversed = 0.0;
  while (traversed < maxDist) {

    // Reset traversal variables
    float3 offset = (float3)(0.0);
    float boxDim = 1.0;
    int child;

    // Init the octree node index to the root
    int otNodeIndex = OctreeRootNodeIndex();

    // Start traversing octree
    // Rely on finding a leaf for loop termination
    while (true) {

      // See if the BST tree is good enough
      int brickIndex = 0;
      bool bstSuccess = TraverseBST(otNodeIndex, &brickIndex, timestep,
                                    constants, reqList, tsp);

      if (bstSuccess) {

        // Visit and use brick (e.g. probing or rendering)
        UseBrick(brickIndex);
        // We are now done with this node, so go to next
        break;

      } else if (IsOctreeLeaf(otNodeIndex,
                              constants->numValuesPerNode, tsp)) {
        // If the BST lookup failed but the octree node is a leaf,
        // use the brick anyway (it is the BST leaf)
        UseBrick(brickIndex);
        // We are now done with this node, so go to next
        break;

      } else {
        // If the BST lookup failed and we can traverse the octree,
        // visit the child that encloses the point

        // Next box dimension
        boxDim = boxDim/2.0;

        // Current mid point
        float boxMid = boxDim;

        // Check which child encloses P

        if (constants->gridType == 0) { // Cartesian
          child = EnclosingChild(P, boxMid, offset);
        } else { // Spherical (==1)
          child = EnclosingChild(CartesianToSpherical(P), boxMid, offset);
        }

        // Update offset
        UpdateOffset(&offset, boxDim, child);

        // Update node index to new node
        int oldIndex = otNodeIndex;
        otNodeIndex = OTChildIndex(otNodeIndex, constants->numValuesPerNode,
                                   child, tsp);
      }

    } // while traversing

    // Update
    traversed += stepsize;
    P += stepsize * rayD;

  } // while (traversed < maxDist)
}
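The child numbering used by EnclosingChild and UpdateOffset in the listing above follows a bit pattern: bit 0 selects the upper half in x, bit 1 in y and bit 2 in z. The host-side C++ sketch below expresses the same logic with bit operations. It is an equivalent reformulation for illustration, not code from the implementation; `Vec3` stands in for the OpenCL `float3` type and the function names are hypothetical.

```cpp
#include <cassert>

// Stand-in for the OpenCL float3 type (host-side illustration only).
struct Vec3 {
  float x, y, z;
};

// Same result as EnclosingChild in the listing: each axis contributes one bit.
int EnclosingChildBits(const Vec3& p, float boxMid, const Vec3& offset) {
  int child = 0;
  if (!(p.x < boxMid + offset.x)) child |= 1;
  if (!(p.y < boxMid + offset.y)) child |= 2;
  if (!(p.z < boxMid + offset.z)) child |= 4;
  return child;
}

// Same result as UpdateOffset in the listing: move the offset along every
// axis whose bit is set in the child number.
void UpdateOffsetBits(Vec3* offset, float boxDim, int child) {
  assert(child >= 0 && child < 8);
  if (child & 1) offset->x += boxDim;
  if (child & 2) offset->y += boxDim;
  if (child & 4) offset->z += boxDim;
}
```

For a point in the upper x and z halves and the lower y half of the unit cube, both formulations select child 5 and move the offset by half the box size along x and z.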
A.2 Brick Padding
// Loop over all timesteps
for (unsigned int i=0; i<numTimesteps; ++i) {

  // Storage for one time step of the raw data
  std::vector<float> timestepData(xDim*yDim*zDim,
                                  static_cast<float>(0));

  // Point to the right position in the file stream and read it
  off_t timestepSize = xDim*yDim*zDim*dataSize;
  off_t timestepOffset = static_cast<off_t>(i)*timestepSize+headerOffset;
  fseeko(in, timestepOffset, SEEK_SET);
  fread(reinterpret_cast<void*>(&timestepData[0]), timestepSize, 1, in);

  // We now have a non-padded time step, and need to pad the borders

  // Allocate space for the padded data
  std::vector<float> paddedData(xPaddedDim*yPaddedDim*zPaddedDim,
                                static_cast<float>(0));

  // Loop over the padded volume that we want to fill
  // xp = "x padded"
  // xo = "x original"
  unsigned int xo, yo, zo;
  for (unsigned int zp=0; zp<zPaddedDim; ++zp) {
    for (unsigned int yp=0; yp<yPaddedDim; ++yp) {
      for (unsigned int xp=0; xp<xPaddedDim; ++xp) {

        if (xp == 0) {
          xo = xp;
        } else if (xp == xPaddedDim-1) {
          xo = xp-2;
        } else {
          xo = xp-1;
        }

        if (yp == 0) {
          yo = yp;
        } else if (yp == yPaddedDim-1) {
          yo = yp-2;
        } else {
          yo = yp-1;
        }

        if (zp == 0) {
          zo = zp;
        } else if (zp == zPaddedDim-1) {
          zo = zp-2;
        } else {
          zo = zp-1;
        }

        // The z stride of the original volume is xDim*yDim
        paddedData[xp + yp*xPaddedDim + zp*xPaddedDim*yPaddedDim] =
          timestepData[xo + yo*xDim + zo*xDim*yDim];
      }
    }
  }

  // Create a container for the octree leaf level bricks
  std::vector<Brick<float>* > baseLevelBricks(numBricksBaseLevel, NULL);

  // Loop over the volume's subvolumes and create one brick for each
  for (unsigned int zBrick=0; zBrick<zNumBricks; ++zBrick) {
    for (unsigned int yBrick=0; yBrick<yNumBricks; ++yBrick) {
      for (unsigned int xBrick=0; xBrick<xNumBricks; ++xBrick) {

        Brick<float> *brick = Brick<float>::New(xPaddedBrickDim,
                                                yPaddedBrickDim,
                                                zPaddedBrickDim,
                                                static_cast<float>(0));

        // Loop over the subvolume's voxels
        unsigned int xMin = xBrick * xBrickDim;
        unsigned int xMax = (xBrick + 1)*xBrickDim-1+paddingWidth*2;
        unsigned int yMin = yBrick * yBrickDim;
        unsigned int yMax = (yBrick + 1)*yBrickDim-1+paddingWidth*2;
        unsigned int zMin = zBrick * zBrickDim;
        unsigned int zMax = (zBrick + 1)*zBrickDim-1+paddingWidth*2;

        unsigned int zLoc = 0;
        for (unsigned int zSub=zMin; zSub<=zMax; ++zSub) {
          unsigned int yLoc = 0;
          for (unsigned int ySub=yMin; ySub<=yMax; ++ySub) {
            unsigned int xLoc = 0;
            for (unsigned int xSub=xMin; xSub<=xMax; ++xSub) {
              // Look up global index in full volume
              unsigned int globalIndex =
                xSub + ySub*xPaddedDim + zSub*xPaddedDim*yPaddedDim;
              // Set data at local subvolume index
              brick->SetData(xLoc, yLoc, zLoc, paddedData[globalIndex]);
              xLoc++;
            }
            yLoc++;
          }
          zLoc++;
        }

        // Save to octree leaf level
        unsigned int brickIndex =
          xBrick + yBrick*xNumBricks + zBrick*xNumBricks*yNumBricks;
        baseLevelBricks[brickIndex] = brick;
      }
    }
  }
}
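The three if/else chains in the listing above replicate the border voxels of the volume. For a padding width of one voxel, which is what the listing assumes when it maps the last padded coordinate to xp-2, the mapping can also be written as a clamp. The following one-dimensional sketch shows that formulation; the function names are illustrative.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Map a padded coordinate to the original coordinate for a padding width of
// one voxel on each side: shift by one and clamp to [0, originalDim - 1].
unsigned int PaddedToOriginal(unsigned int padded, unsigned int originalDim) {
  assert(originalDim > 0 && padded < originalDim + 2);
  int shifted = static_cast<int>(padded) - 1;
  int last = static_cast<int>(originalDim) - 1;
  return static_cast<unsigned int>(std::min(std::max(shifted, 0), last));
}

// Pad one row of voxels by replicating its first and last values.
std::vector<float> PadRow(const std::vector<float>& row) {
  assert(!row.empty());
  unsigned int dim = static_cast<unsigned int>(row.size());
  std::vector<float> padded(dim + 2);
  for (unsigned int xp = 0; xp < dim + 2; ++xp) {
    padded[xp] = row[PaddedToOriginal(xp, dim)];
  }
  return padded;
}
```

Applied to the row 1, 2, 3 this gives 1, 1, 2, 3, 3, which matches what the explicit branches produce along each axis.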
A.3 Octree Construction
// Loop over all timesteps
for (unsigned int i=0; i<numTimesteps; ++i) {

  // Make a container for all the octree bricks
  std::vector<Brick<float>* > octreeBricks(numBricksPerOctree);

  // Use Z-order coordinates to rearrange the base level bricks
  // so that the eight children for each parent node lie
  // next to each other
  for (uint16_t z=0; z<static_cast<uint16_t>(zNumBricks); ++z) {
    for (uint16_t y=0; y<static_cast<uint16_t>(yNumBricks); ++y) {
      for (uint16_t x=0; x<static_cast<uint16_t>(xNumBricks); ++x) {
        unsigned int zOrderIdx =
          static_cast<unsigned int>(ZOrder(x, y, z));
        unsigned int idx = x + y*xNumBricks + z*xNumBricks*yNumBricks;
        octreeBricks[zOrderIdx] = baseLevelBricks[idx];
      }
    }
  }

  // Construct higher levels of octree

  // Position for next brick, starting at position beyond base level
  unsigned int brickPos = numBricksBaseLevel;
  // Position for first child to average
  unsigned int childPos = 0;

  while (brickPos < numBricksPerOctree) {
    // Filter the eight children and then combine them to build
    // the higher level node
    std::vector<Brick<float>* > filteredChildren(8, NULL);
    unsigned int i=0;
    for (unsigned int child=childPos; child<childPos+8; ++child) {
      Brick<float> *filteredChild =
        Brick<float>::Filter(octreeBricks[child]);
      filteredChildren[i++] = filteredChild;
    }
    Brick<float> *newBrick = Brick<float>::Combine(filteredChildren);

    // Free up some memory
    for (auto it=filteredChildren.begin();
         it!=filteredChildren.end(); ++it) {
      delete *it;
      *it = NULL;
    }

    // Set next child pos
    childPos += 8;

    // Save new brick
    octreeBricks[brickPos++] = newBrick;
  }

  // Write octree to file
  for (auto it=octreeBricks.begin(); it!=octreeBricks.end(); ++it) {
    fwrite(reinterpret_cast<void*>(&(*it)->data[0]),
           static_cast<size_t>((*it)->Size()), 1, out);
    // Free memory when we're done
    delete *it;
  }
}
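The ZOrder function called in the listing above is not shown in this appendix. A common way to compute such an index is Morton encoding, where the bits of the three coordinates are interleaved. The sketch below is one possible implementation under the assumption that x occupies the lowest bit of each triple; the bit order used by the actual ZOrder function is not confirmed by the listing, and the names `SpreadBits3` and `ZOrderSketch` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Spread the 16 bits of v so that bit i ends up at position 3*i.
std::uint64_t SpreadBits3(std::uint16_t v) {
  std::uint64_t result = 0;
  for (int bit = 0; bit < 16; ++bit) {
    std::uint64_t value = (static_cast<std::uint64_t>(v) >> bit) & 1u;
    result |= value << (3 * bit);
  }
  return result;
}

// Morton (Z-order) index with x in the lowest bit of each bit triple.
// This bit order is an assumption made for illustration.
std::uint64_t ZOrderSketch(std::uint16_t x, std::uint16_t y, std::uint16_t z) {
  return SpreadBits3(x) | (SpreadBits3(y) << 1) | (SpreadBits3(z) << 2);
}
```

With this encoding the eight bricks that share a parent, for example those with coordinates 2 or 3 on every axis, receive the consecutive indices 56 to 63. This is the property the construction loop relies on when it combines eight neighbouring entries at a time.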
A.4 BST Assembling
// If the number of timesteps is not a power of two, copy the last timesteps
// enough times to make the number a power of two
CheckPowerOfTwo();

// Create base level temp file by reversing the level order

{ // Scoping files

  std::FILE *in = fopen(tempFilename.c_str(), "r");
  if (!in) return false;
  std::FILE *out = fopen(newFilename.c_str(), "w");
  if (!out) return false;

  // Read one octree level at a time, starting from the back of source
  // Write to out file in reverse order

  // Position at end of file
  for (unsigned int ts=0; ts<numTimesteps; ++ts) {

    off_t octreePos=static_cast<off_t>((numOTNodes)*numBrickVals*(ts+1));
    for (unsigned int level=0; level<numLevels; ++level) {

      unsigned int bricksPerLevel = pow(8, level);
      unsigned int valuesPerLevel = numBrickVals*bricksPerLevel;
      octreePos -= valuesPerLevel;
      std::vector<float> buffer(valuesPerLevel);

      fseeko(in, octreePos*(off_t)sizeof(float), SEEK_SET);
      size_t readSize = static_cast<size_t>(valuesPerLevel)*sizeof(float);
      fread(reinterpret_cast<void*>(&buffer[0]), readSize, 1, in);
      fwrite(reinterpret_cast<void*>(&buffer[0]), readSize, 1, out);
    }
  }

  fclose(in);
  fclose(out);

} // Scoping files

// Create one file for every level of the BST tree structure
// by averaging the values in the one below.
unsigned int numTimestepsInLevel = numTimesteps;
unsigned int numValsInOT = numBrickVals*numOTNodes;
std::vector<float> inBuffer1(numValsInOT);
std::vector<float> inBuffer2(numValsInOT);
std::vector<float> outBuffer(numValsInOT);

size_t OTBytes = static_cast<size_t>(numValsInOT * sizeof(float));
std::string fromFilename = newFilename;
std::string toFilename;

do {

  std::stringstream ss;
  ss << BSTLevel - 1;
  std::cout << "Creating level " << BSTLevel << std::endl;
  toFilename = tempFilename + "." + ss.str() + ".tmp";

  // Init files
  std::FILE *in = fopen(fromFilename.c_str(), "r");
  if (!in) return false;
  std::FILE *out = fopen(toFilename.c_str(), "w");
  if (!out) return false;

  fseeko(in, 0, SEEK_END);
  off_t fileSize = ftello(in);
  fseeko(in, 0, SEEK_SET);

  for (unsigned int ts=0; ts<numTimestepsInLevel; ts+=2) {

    // Read two octrees (two time steps)
    fread(reinterpret_cast<void*>(&inBuffer1[0]), OTBytes, 1, in);
    fread(reinterpret_cast<void*>(&inBuffer2[0]), OTBytes, 1, in);

    // Average time steps
    for (unsigned int i=0; i<outBuffer.size(); ++i) {
      outBuffer[i] = (inBuffer1[i] + inBuffer2[i]) / static_cast<float>(2);
    }

    // Write brick
    fwrite(reinterpret_cast<void*>(&outBuffer[0]), OTBytes, 1, out);
  }

  fromFilename = toFilename;

  fclose(in);
  fclose(out);

  BSTLevel--;
  numTimestepsInLevel /= 2;

} while (BSTLevel != 0);

std::FILE *out = fopen(outFilename.c_str(), "w");

// Write metadata to file
WriteHeader(out);

// Write each level to output
for (unsigned int level=0; level<numBSTLevels; ++level) {

  std::stringstream ss;
  ss << level;
  std::string fromFilename = tempFilename + "." + ss.str() + ".tmp";

  std::FILE *in = fopen(fromFilename.c_str(), "r");
  if (!in) return false;

  fseeko(in, 0, SEEK_END);
  off_t inFileSize = ftello(in);
  fseeko(in, 0, SEEK_SET);

  std::vector<float> buffer((size_t)inFileSize/sizeof(float));
  // Read whole file, write to out file
  fread(reinterpret_cast<void*>(&buffer[0]),
        static_cast<size_t>(inFileSize), 1, in);

  fwrite(reinterpret_cast<void*>(&buffer[0]),
         static_cast<size_t>(inFileSize), 1, out);

  fclose(in);
}
fclose(out);

// Do some checking and data validation
CheckFileSize();
Validate Data();
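The body of CheckPowerOfTwo is not shown in this appendix. Following the comment at the top of the listing, the padding of the time axis could be computed along the lines below. This is a sketch under that assumption, with hypothetical function names, and it only covers the index arithmetic, not the file handling.

```cpp
#include <cassert>

// Smallest power of two that is greater than or equal to n.
unsigned int NextPowerOfTwo(unsigned int n) {
  assert(n > 0 && n <= 0x80000000u);
  unsigned int p = 1;
  while (p < n) p *= 2;
  return p;
}

// Number of levels in a full binary tree with paddedTimesteps leaves.
unsigned int NumBstLevels(unsigned int paddedTimesteps) {
  assert(paddedTimesteps > 0);
  unsigned int levels = 1;
  while (paddedTimesteps > 1) {
    paddedTimesteps /= 2;
    ++levels;
  }
  return levels;
}

// Source time step to read for a padded index: indices beyond the original
// sequence repeat the last time step.
unsigned int SourceTimestep(unsigned int paddedIndex,
                            unsigned int numTimesteps) {
  assert(numTimesteps > 0);
  return paddedIndex < numTimesteps ? paddedIndex : numTimesteps - 1;
}
```

A sequence of 5 time steps would be padded to 8, with padded indices 5 to 7 all reading time step 4, and the resulting BST would have 4 levels, one temporary file per level in the averaging loop.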
A.5 TSP Tree Structure Construction
void TSP::Construct() {
  // Structure is saved in int array

  // Loop over the OTs (one per BST node)
  for (unsigned int OT=0; OT<numBSTNodes; ++OT) {

    // Start at the root of each OT
    unsigned int OTNode = OT*numOTNodes;

    // Calculate BST level (first level is level 0)
    unsigned int BSTLevel = (unsigned int)(log(OT+1)/log(2));

    // Traverse OT
    unsigned int OTChild = 1;
    unsigned int OTLevel = 0;
    while (OTLevel < numOTLevels) {

      unsigned int OTNodesInLevel =
        static_cast<unsigned int>(pow(8, OTLevel));

      for (unsigned int i=0; i<OTNodesInLevel; ++i) {

        // Brick index
        data[OTNode*NUM_DATA + BRICK_INDEX] = (int)OTNode;

        // Error metrics
        int localOTNode = (OTNode - OT*numOTNodes);
        data[OTNode*NUM_DATA + TEMPORAL_ERR] =
          (int)(numBSTLevels-1-BSTLevel);
        data[OTNode*NUM_DATA + SPATIAL_ERR] =
          (int)(numOTLevels-1-OTLevel);

        if (BSTLevel == 0) {

          // Calculate OT child index (-1 if node is leaf)
          int OTChildIndex =
            (OTChild < numOTNodes) ? (int)(OT*numOTNodes+OTChild) : -1;
          data[OTNode*NUM_DATA + CHILD_INDEX] = OTChildIndex;

        } else {

          // Calculate BST child index (-1 if node is BST leaf)

          // First BST node of current level
          int firstNode =
            static_cast<unsigned int>((2*pow(2, BSTLevel-1)-1)*numOTNodes);
          // First BST node of next level
          int firstChild =
            static_cast<unsigned int>((2*pow(2, BSTLevel)-1)*numOTNodes);
          // Difference between first nodes between levels
          int levelGap = firstChild-firstNode;
          // How many nodes away from the first node are we?
          int offset = (OTNode-firstNode) / numOTNodes;

          // Use level gap and offset to calculate child index
          int BSTChildIndex =
            (BSTLevel < numBSTLevels-1) ?
              (int)(OTNode+levelGap+(offset*numOTNodes)) : -1;

          data[OTNode*NUM_DATA + CHILD_INDEX] = BSTChildIndex;

        }

        OTNode++;
        OTChild += 8;
      }

      OTLevel++;
    }
  }
}
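The BST child index computed from the level gap and the offset in the listing above reduces to the usual breadth-first rule: for BST node n, the left child is node 2n+1, and the position within the octree is preserved. The sketch below checks this equivalence on the host and also derives the BST level with integer arithmetic instead of floating-point logarithms. It is an illustrative reformulation with hypothetical function names, not code from the implementation.

```cpp
#include <cassert>

// Level of BST node n in breadth-first order: floor(log2(n + 1)),
// computed with integers only.
unsigned int BstLevel(unsigned int bstNode) {
  unsigned int level = 0;
  for (unsigned int v = bstNode + 1; v > 1; v /= 2) ++level;
  return level;
}

// Child index following the steps of the listing (level gap plus offset).
int ChildIndexListing(int otNode, int numOTNodes) {
  assert(otNode >= 0 && numOTNodes > 0);
  int level = static_cast<int>(BstLevel(otNode / numOTNodes));
  int firstNode = ((1 << level) - 1) * numOTNodes;
  int firstChild = ((1 << (level + 1)) - 1) * numOTNodes;
  int levelGap = firstChild - firstNode;
  int offset = (otNode - firstNode) / numOTNodes;
  return otNode + levelGap + offset * numOTNodes;
}

// Equivalent closed form: left child 2n+1, same local octree node.
int ChildIndexClosedForm(int otNode, int numOTNodes) {
  assert(otNode >= 0 && numOTNodes > 0);
  int bstNode = otNode / numOTNodes;
  int local = otNode % numOTNodes;
  return (2 * bstNode + 1) * numOTNodes + local;
}
```

The right child then lies numOTNodes entries further on, which is the step RightBST takes in the traversal kernel in A.1.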
A.6 Error Metrics
bool TSP::CalculateSpatialError() {

  unsigned int numBrickVals =
    paddedBrickDim*paddedBrickDim*paddedBrickDim;

  std::string inFilename = config->TSPFilename();
  std::FILE *in = fopen(inFilename.c_str(), "r");
  if (!in) {
    ERROR("Failed to open" << inFilename);
    return false;
  }

  std::vector<float> buffer(numBrickVals);
  std::vector<float> averages(numTotalNodes);
  std::vector<float> stdDevs(numTotalNodes);

  // First pass: Calculate average color for each brick
  INFO("\nCalculating spatial error, first pass");
  for (unsigned int brick=0; brick<numTotalNodes; ++brick) {

    // Offset in file
    off_t offset =
      dataPos + static_cast<off_t>(brick*numBrickVals*sizeof(float));
    fseeko(in, offset, SEEK_SET);

    fread(reinterpret_cast<void*>(&buffer[0]),
          static_cast<size_t>(numBrickVals)*sizeof(float), 1, in);

    float average = 0.f;
    for (auto it=buffer.begin(); it!=buffer.end(); ++it) {
      average += *it;
    }

    averages[brick] = average / static_cast<float>(numBrickVals);
  }

  // Spatial SNR stats
  float minError = 1e20f;
  float maxError = 0.f;
  std::vector<float> medianArray(numTotalNodes);

  // Second pass: For each brick, compare the covered leaf voxels with
  // the brick average
  INFO("Calculating spatial error, second pass");
  for (unsigned int brick=0; brick<numTotalNodes; ++brick) {

    // Fetch mean intensity
    float brickAvg = averages[brick];

    // Sum for std dev computation
    float stdDev = 0.f;

    // Get a list of leaf bricks that the current brick covers
    std::list<unsigned int> coveredLeafBricks =
      CoveredLeafBricks(brick);

    // If the brick is already a leaf, assign a negative error.
    // Ad hoc "hack" to distinguish leaves from other nodes that happen
    // to get a zero error due to rounding errors or other reasons.
    if (coveredLeafBricks.size() == 1) {
      stdDev = -0.1f;
    } else {

      // Calculate "standard deviation" corresponding to leaves
      for (auto lb=coveredLeafBricks.begin();
           lb!=coveredLeafBricks.end(); ++lb) {

        // Read brick
        off_t offset =
          dataPos + static_cast<off_t>((*lb)*numBrickVals*sizeof(float));
        fseeko(in, offset, SEEK_SET);
        fread(reinterpret_cast<void*>(&buffer[0]),
              static_cast<size_t>(numBrickVals)*sizeof(float), 1, in);

        // Add to sum
        for (auto v=buffer.begin(); v!=buffer.end(); ++v) {
          stdDev += pow(*v-brickAvg, 2.f);
        }

      }

      // Finish calculation
      if (sizeof(float) != sizeof(int)) {
        ERROR("Float and int sizes don't match, can't reinterpret");
        return false;
      }

      stdDev /= static_cast<float>(coveredLeafBricks.size()*numBrickVals);
      stdDev = sqrt(stdDev);

    }

    // Independent checks, since one value can update both bounds
    if (stdDev < minError) {
      minError = stdDev;
    }
    if (stdDev > maxError) {
      maxError = stdDev;
    }

    stdDevs[brick] = stdDev;
    medianArray[brick] = stdDev;

  }

  fclose(in);

  std::sort(medianArray.begin(), medianArray.end());
  float medError = medianArray[medianArray.size()/2];

  // "Normalize" errors
  float minNorm = 1e20f;
  float maxNorm = 0.f;
  for (unsigned int i=0; i<numTotalNodes; ++i) {
    // float normalized = (stdDevs[i]-minError)/(maxError-minError);
    if (stdDevs[i] > 0.f) {
      stdDevs[i] = pow(stdDevs[i], 0.5f);
    }
    data[i*NUMDATA+SPATIAL_ERR] = *reinterpret_cast<int*>(&stdDevs[i]);
    // Independent checks, since one value can update both bounds
    if (stdDevs[i] < minNorm) {
      minNorm = stdDevs[i];
    }
    if (stdDevs[i] > maxNorm) {
      maxNorm = stdDevs[i];
    }
  }

  std::sort(stdDevs.begin(), stdDevs.end());
  float medNorm = stdDevs[stdDevs.size()/2];

  minSpatialError = minNorm;
  maxSpatialError = maxNorm;
  medianSpatialError = medNorm;

  return true;
}
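The spatial error computed by the function above is, for each brick, the root-mean-square deviation of all covered leaf voxels from the brick's own average. The sketch below restates that second pass on in-memory data, without the file I/O. It is an illustration under assumptions rather than the thesis code: the function name `SpatialErrorSketch` and its parameters are hypothetical, and the sum is accumulated in `double`, which the original listing does not do.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical in-memory analogue of the second pass above: the RMS
// deviation of all covered leaf voxels from the brick's average value.
// A brick covering a single leaf gets the -0.1 sentinel, as in the listing.
float SpatialErrorSketch(float brickAvg,
                         const std::vector<std::vector<float>> &leafBricks) {
  if (leafBricks.size() == 1) return -0.1f;

  double sum = 0.0;       // accumulated in double (an assumption of this sketch)
  std::size_t count = 0;  // total number of voxels visited
  for (const auto &leaf : leafBricks) {
    for (float v : leaf) {
      double d = static_cast<double>(v) - brickAvg;
      sum += d * d;
      ++count;
    }
  }
  if (count == 0) return 0.f;  // nothing covered, nothing to measure
  return static_cast<float>(std::sqrt(sum / static_cast<double>(count)));
}
```

The negative sentinel keeps leaves distinguishable from inner nodes whose error happens to be exactly zero, which is the purpose stated in the listing's comments.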

bool TSP::CalculateTemporalError() {

  std::string inFilename = config->TSPFilename();
  std::FILE *in = fopen(inFilename.c_str(), "r");
  if (!in) {
    ERROR("Failed to open " << inFilename);
    return false;
  }

  std::vector<float> meanArray(numTotalNodes);

  // Save errors
  std::vector<float> errors(numTotalNodes);

  // Calculate temporal error for one brick at a time
  for (unsigned int brick=0; brick<numTotalNodes; ++brick) {

    unsigned int numBrickVals =
      paddedBrickDim*paddedBrickDim*paddedBrickDim;

    // Save the individual voxel's average over timesteps. Because the
    // BSTs are built by averaging leaf nodes, we only need to sample
    // the brick at the correct coordinate.
    std::vector<float> voxelAverages(numBrickVals);
    std::vector<float> voxelStdDevs(numBrickVals);

    // Read the whole brick to fill the averages
    off_t offset =
      dataPos + static_cast<off_t>(brick*numBrickVals*sizeof(float));
    fseeko(in, offset, SEEK_SET);
    fread(reinterpret_cast<void*>(&voxelAverages[0]),
          static_cast<size_t>(numBrickVals)*sizeof(float), 1, in);

    // Build a list of the BST leaf bricks (within the same octree level)
    // that this brick covers
    std::list<unsigned int> coveredBricks = CoveredBSTLeafBricks(brick);

    // If the brick is at the lowest BST level, automatically set the error
    // to -0.1 (enables using -1 as a marker for "no error accepted");
    // Somewhat ad hoc to get around the fact that the error could be
    // 0.0 higher up in the tree
    if (coveredBricks.size() == 1) {
      errors[brick] = -0.1f;
    } else {

      // Calculate standard deviation per voxel, average over brick
      float avgStdDev = 0.f;
      for (unsigned int voxel=0; voxel<numBrickVals; ++voxel) {

        float stdDev = 0.f;
        for (auto leaf=coveredBricks.begin();
             leaf!=coveredBricks.end(); ++leaf) {

          // Sample the leaves at the corresponding voxel position
          off_t sampleOffset = dataPos +
            static_cast<off_t>((*leaf*numBrickVals+voxel)*sizeof(float));
          fseeko(in, sampleOffset, SEEK_SET);
          float sample;
          fread(reinterpret_cast<void*>(&sample), sizeof(float), 1, in);

          stdDev += pow(sample-voxelAverages[voxel], 2.f);
        }
        stdDev /= static_cast<float>(coveredBricks.size());
        stdDev = sqrt(stdDev);

        avgStdDev += stdDev;
      } // for voxel

      avgStdDev /= static_cast<float>(numBrickVals);
      meanArray[brick] = avgStdDev;
      errors[brick] = avgStdDev;

    }
  } // for all bricks

  fclose(in);

  std::sort(meanArray.begin(), meanArray.end());
  float medErr = meanArray[meanArray.size()/2];

  // Adjust errors using user-provided exponents
  float minNorm = 1e20f;
  float maxNorm = 0.f;
  for (unsigned int i=0; i<numTotalNodes; ++i) {
    if (errors[i] > 0.f) {
      errors[i] = pow(errors[i], 0.25f);
    }
    data[i*NUMDATA+TEMPORAL_ERR] = *reinterpret_cast<int*>(&errors[i]);
    // Independent checks, since one value can update both bounds
    if (errors[i] < minNorm) {
      minNorm = errors[i];
    }
    if (errors[i] > maxNorm) {
      maxNorm = errors[i];
    }
  }

  std::sort(errors.begin(), errors.end());
  float medNorm = errors[errors.size()/2];

  minTemporalError = minNorm;
  maxTemporalError = maxNorm;
  medianTemporalError = medNorm;

  return true;
}
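Both error functions store a `float` error in the integer node array by dereferencing a `reinterpret_cast<int*>`. Reading a `float` object through an `int` lvalue is not covered by the C++ aliasing rules, even though it tends to work on common compilers. A possible alternative is to copy the bit pattern with `std::memcpy`, as sketched below. The helper names `FloatToBits` and `BitsToFloat` are hypothetical and do not appear in the thesis code, and the sketch assumes a 32-bit `float`.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::int32_t),
              "sketch assumes a 32-bit float");

// Hypothetical helper: copy the bit pattern of a float into an integer
std::int32_t FloatToBits(float value) {
  std::int32_t bits;
  std::memcpy(&bits, &value, sizeof(bits));
  return bits;
}

// Hypothetical helper: recover the float from its stored bit pattern
float BitsToFloat(std::int32_t bits) {
  float value;
  std::memcpy(&value, &bits, sizeof(value));
  return value;
}
```

With such helpers, the store in the listings would read along the lines of `data[i*NUMDATA+TEMPORAL_ERR] = FloatToBits(errors[i]);`, and the compile-time size check would replace the runtime `sizeof` comparison.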
A.7 Rendering Loop
bool Raycaster::Render() {

  // Update transformation matrices and bind them to color cube shader
  if (!UpdateMatrices()) return false;
  if (!BindTransformationMatrices(cubeShaderProgram)) return false;

  glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

  // Render cube
  glUseProgram(cubeShaderProgram->Handle());
  cubePositionAttrib = cubeShaderProgram->GetAttribLocation("position");
  glFrontFace(GL_CW);
  glEnable(GL_CULL_FACE);

  // Front cube
  glBindFramebuffer(GL_FRAMEBUFFER, cubeFrontFBO);
  glCullFace(GL_BACK);
  glBindVertexArray(cubeVAO);
  glBindBuffer(GL_ARRAY_BUFFER, cubePosbufferObject);
  glEnableVertexAttribArray(cubePositionAttrib);
  glVertexAttribPointer(0, 4, GL_FLOAT, GL_FALSE, 0, 0);
  glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
  glDrawArrays(GL_TRIANGLES, 0, 144);
  glDisableVertexAttribArray(cubePositionAttrib);
  glBindBuffer(GL_ARRAY_BUFFER, 0);
  glBindFramebuffer(GL_FRAMEBUFFER, 0);
  glBindVertexArray(0);

  // Back cube
  glBindFramebuffer(GL_FRAMEBUFFER, cubeBackFBO);
  glCullFace(GL_FRONT);
  glBindVertexArray(cubeVAO);
  glBindBuffer(GL_ARRAY_BUFFER, cubePosbufferObject);
  glEnableVertexAttribArray(cubePositionAttrib);
  glVertexAttribPointer(0, 4, GL_FLOAT, GL_FALSE, 0, 0);
  glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
  glDrawArrays(GL_TRIANGLES, 0, 144);
  glDisableVertexAttribArray(cubePositionAttrib);
  glBindBuffer(GL_ARRAY_BUFFER, 0);
  glBindFramebuffer(GL_FRAMEBUFFER, 0);
  glBindVertexArray(0);

  glUseProgram(0);

  // Get current and next time step from separate Animator class
  unsigned int currentTimestep;
  unsigned int nextTimestep;
  currentTimestep = animator->CurrentTimestep();
  nextTimestep = animator->NextTimestep();

  // Choose buffers
  BrickManager::BUFFER_INDEX currentBuf, nextBuf;
  if (currentTimestep % 2 == 0) {
    currentBuf = BrickManager::EVEN;
    nextBuf = BrickManager::ODD;
  } else {
    currentBuf = BrickManager::ODD;
    nextBuf = BrickManager::EVEN;
  }

  // When starting a rendering iteration, the PBO corresponding to the
  // current timestep is loaded with the data.

  // Launch traversal of the next timestep
  if (!LaunchTSPProbing(nextTimestep)) return false;

  // While traversal of next step is working, upload current data to atlas
  if (!brickManager->PBOToAtlas(currentBuf)) return false;

  // Make sure the traversal kernel is done
  if (!clManager->FinishProgram("TSPProbing")) return false;

  // Read buffer and release the memory
  if (!clManager->ReadBuffer("TSPProbing", tspBrickListArg,
                             reinterpret_cast<void*>(&brickRequest[0]),
                             brickRequest.size()*sizeof(int),
                             true)) return false;

  if (!clManager->ReleaseBuffer("TSPProbing", tspBrickListArg))
    return false;

  // When traversal of next timestep is done, launch raycasting kernel
  if (!clManager->SetInt("TSPRaycaster", timestepArg, currentTimestep))
    return false;

  // Add brick list
  if (!clManager->
      AddBuffer("TSPRaycaster", brickListArg,
                reinterpret_cast<void*>(&(brickManager->BrickList(currentBuf)[0])),
                brickManager->BrickList(currentBuf).size()*sizeof(int),
                CLManager::COPY_HOST_PTR,
                CLManager::READONLY)) return false;

  if (!clManager->PrepareProgram("TSPRaycaster")) return false;

  if (!clManager->LaunchProgram("TSPRaycaster",
                                winWidth,
                                winHeight,
                                config->LocalWorkSizeX(),
                                config->LocalWorkSizeY()))
    return false;

  // While the raycaster kernel is working, build next brick list and start
  // upload to the next PBO
  if (!brickManager->BuildBrickList(nextBuf, brickRequest)) return false;
  if (!brickManager->DiskToPBO(nextBuf)) return false;

  // Finish raycaster and render current frame
  if (!clManager->ReleaseBuffer("TSPRaycaster", brickListArg))
    return false;
  if (!clManager->FinishProgram("TSPRaycaster")) return false;

  // Render to framebuffer using quad
  glBindFramebuffer(GL_FRAMEBUFFER,
                    SGCTWinManager::Instance()->FBOHandle());

  if (!quadTex->Bind(quadShaderProgram, "quadTex", 0)) return false;

  glDisable(GL_CULL_FACE);

  glUseProgram(quadShaderProgram->Handle());
  quadPositionAttrib = quadShaderProgram->GetAttribLocation("position");
  if (quadPositionAttrib == -1) {
    ERROR("Quad position attribute lookup failed");
    return false;
  }
  glCullFace(GL_BACK);
  glBindVertexArray(quadVAO);
  glBindBuffer(GL_ARRAY_BUFFER, quadPosbufferObject);
  glEnableVertexAttribArray(quadPositionAttrib);
  glVertexAttribPointer(0, 4, GL_FLOAT, GL_FALSE, 0, 0);
  glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
  glDrawArrays(GL_TRIANGLES, 0, 6);
  glDisableVertexAttribArray(quadPositionAttrib);
  glBindVertexArray(0);

  glBindFramebuffer(GL_FRAMEBUFFER, 0);

  if (CheckGLError("Quad rendering") != GL_NO_ERROR) {
    return false;
  }

  glUseProgram(0);

  // Window manager takes care of swapping buffers

  return true;
}
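The rendering loop overlaps work by alternating between two buffers: the buffer matching the parity of the current timestep is consumed while the other is filled for the next frame. The selection step can be isolated as in the sketch below. The enum `BufferIndex` and the function `ChooseBuffers` are hypothetical stand-ins for `BrickManager::BUFFER_INDEX` and the inline `if`/`else` in the listing, not part of the thesis code.

```cpp
#include <utility>

// Hypothetical stand-in for BrickManager::BUFFER_INDEX
enum BufferIndex { EVEN = 0, ODD = 1 };

// Returns {current, next}: the buffer matching the parity of the current
// timestep is consumed while the other one is filled for the next frame.
std::pair<BufferIndex, BufferIndex> ChooseBuffers(unsigned int timestep) {
  BufferIndex current = (timestep % 2 == 0) ? EVEN : ODD;
  BufferIndex next = (current == EVEN) ? ODD : EVEN;
  return {current, next};
}
```

As in the listing, the pairing depends only on the parity of the current timestep, so the scheme assumes that consecutive timesteps differ in parity.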