Dynamisk visualisering av rymdvädersimuleringsdata (Dynamic Visualization of Space Weather Simulation Data). Victor Sand, 2014-05-16

Department of Science and Technology, Linköping University

SE-601 74 Norrköping, Sweden

LiU-ITN-TEK-A-14/009-SE

Dynamisk visualisering av rymdvädersimuleringsdata

Victor Sand

2014-05-16


Master's thesis in Media Technology, carried out at the Institute of Technology at Linköping University

Victor Sand

Supervisor: Alexander Bock
Examiner: Anders Ynnerman

Norrköping, 2014-05-16


Copyright

The publishers will keep this document online on the Internet - or its possible replacement - for a considerable time from the date of publication barring exceptional circumstances.

The online availability of the document implies a permanent permission for anyone to read, to download, to print out single copies for your own use and to use it unchanged for any non-commercial research and educational purpose. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional on the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility.

According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement.

For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its WWW home page: http://www.ep.liu.se/

© Victor Sand

Dynamic Visualization

of Space Weather Data

Victor Sand

Master of Science in Engineering (Civilingenjör), Media Technology

Linköping University

Master’s thesis

Goddard Space Flight Center, Maryland, USA

Norrköping, Sweden

June 2014

Abstract

The work described in this thesis is part of the Open Space project, a collaboration between Linköping University, the National Aeronautics and Space Administration and the American Museum of Natural History. The long-term goal of Open Space is a multi-purpose, open-source scientific visualization software.

The thesis covers the research and implementation of a pipeline for preparing and rendering volumetric data. The developed pipeline consists of three stages: a data formatting stage which takes data from various sources and prepares it for the rest of the pipeline, a pre-processing stage which builds a tree structure of the raw data, and finally an interactive rendering stage which draws a volume using ray-casting.

Large parts of the system are built around the use of a Time-Space Partitioning tree, originally described by Shen et al. This tree structure uses an error metric system and an octree-based structure to efficiently choose the appropriate level of detail during rendering. The data storage and structure are similar to those in the GigaVoxels system by Crassin et al. Using a combination of these concepts and constructing the pipeline around them, space weather related volumes have been successfully rendered at interactive rates.

The pipeline is a fully working proof-of-concept for future development of Open Space, and can be used as-is to render space weather data. Many concepts and ideas from this work can be utilized in the larger-scale software project.


Acknowledgements

First of all, I would like to thank my examiner, professor Anders Ynnerman, for the fantastic opportunity and for keeping the project running.

Thanks also to my excellent advisor Alexander Bock for many late hours of

support and idea discussions. Your willingness to help and share your vast

graphics knowledge has been truly invaluable.

Thank you Masha for your tireless and dedicated work with CCMC and for

taking care of us thesis students. Your genuine interest in the project is a

requirement for its success! I’m sure the next couple of students will feel

just as welcome. Thank you Carter for keeping us busy and for the great

private tour of the museum. Bob, thank you for keeping an eye on the big

picture!

Aki, thanks for making my commute shorter, my lunches more tasty and

my country music knowledge more solid. Nate, thanks for being a bro and

thanks Avery for letting me sleep on your floor for a while. Come to Sweden

and I’ll repay the favors! Thanks Martin for doing a great job during the

first stage of the project and thereby making my job easier. Thanks to my

many different roommates and friends in Washington D.C. for making my

stay so much more than only work. I hope to see many of you again soon!

Many thanks to Holmen AB, Sparbankstiftelsen Alfa and Stiftelsen Anna

Whitlocks Minnesfond for the financial help when CSN wouldn’t lend me

more money. I could not have completed my stay without it.

Finally, thanks to my family for the endless support and encouragement!

Victor

Stockholm, February 2014


Contents

1 Introduction
    1.1 Background
    1.2 Aim and Goals
    1.3 Method
    1.4 Limitations
    1.5 Thesis Structure

2 Background
    2.1 Space Weather
    2.2 Community Coordinated Modeling Center
    2.3 Open Space

3 Previous Work
    3.1 Volume Ray-Casting
    3.2 TSP Tree Acceleration
        Overview and Motivation
        Structure
        Traversal
        Error Metrics
    3.3 Rendering of Large Voxel Datasets
        Data Structure
        Rendering

4 Pipeline Overview
    4.1 Pipeline Stages
    4.2 Inputs and Outputs

5 TSP Tree Implementation
    5.1 Bricks
    5.2 Separation of Structure and Data
    5.3 Memory Layout
    5.4 Error Metrics
    5.5 Pointer Structure
    5.6 Traversal

6 Data Formatting
    6.1 Space Weather Data Sources
        ENLIL
        CDF Data Format
        Kameleon
    6.2 Furnace
    6.3 Voxel Data Format

7 Data Pre-Processing
    7.1 Forge
    7.2 TSP Tree Construction
        Brick Padding
        Octree Construction
        BST Assembling
    7.3 TSP Data Format

8 Rendering
    8.1 Flare
    8.2 TSP Structure and Error Metrics Construction
        TSP Structure Construction
        Error Metrics Calculation
        Error Caching
    8.3 Intra-Frame Pipeline
        View Ray Generation
        TSP Tree Probing
        Brick Uploading
        Ray-Casting
    8.4 Asynchronous Execution
    8.5 Rendering Parameters
    8.6 Cluster Rendering
        SGCT

9 Results
    9.1 Hardware
    9.2 Rendering Benchmarks
    9.3 Error Metrics Benchmarks
    9.4 Visual Results
        Desktop Rendering
        Dome Rendering

10 Discussion and Future Work
    10.1 Visual Quality
        Rendering
    10.2 Interactivity
        Performance
    10.3 Pipeline
        Encapsulation
    10.4 TSP Tree Structure
        Construction
        Storage
    10.5 Data Formats
    10.6 Error Metrics
        Calculation
        Control
    10.7 Rendering

11 Conclusions

References

A Code Samples
    A.1 TSP Tree Traversal
    A.2 Brick Padding
    A.3 Octree Construction
    A.4 BST Assembling
    A.5 TSP Tree Structure Construction
    A.6 Error Metrics
    A.7 Rendering Loop

1 Introduction

This first chapter briefly discusses the background and goals of the work. It also describes the methods used as well as the structure and limitations of the thesis. Note that the background is described in more detail in chapter 2.

1.1 Background

Open Space is the working title for a project initiated in the fall of 2012. Collaborators in the project are Linköping University, the Community Coordinated Modeling Center (CCMC) at the National Aeronautics and Space Administration (NASA) and the American Museum of Natural History (AMNH). The long-term goal of Open Space is an open-source scientific visualization software with focus on space-related data sources. This software will be capable of producing efficient, accurate and beautiful visualizations of phenomena on a scale ranging from the size of atoms to the size of the entire known universe. The software will be used both for scientific purposes and for public dissemination.

In order to accomplish this goal, the participants are engaged in a Master’s thesis

student project. This collaboration enables students from Linköping to be on-site at

NASA Goddard Space Flight Center, working close to the NASA scientists and the data

sources. Input and feature requests for the project come from all three stakeholders,

giving the project a broad purpose that is rooted in computer graphics research, in

space science and in multimedia.


The part of the Open Space project described in this thesis aims to efficiently render

time-varying data sets of space weather using volumetric voxel rendering.

1.2 Aim and Goals

One of the most challenging aspects of volumetric rendering is handling large data

sets efficiently. Time-varying data sets provide additional challenges due to memory

limitations and the need to update the rendering often in order to achieve an animation

with an acceptable frame rate.

The aim of the work presented within this thesis is to implement an efficient volumetric rendering pipeline, capable of handling large time-varying data sets. The results of this work will later be implemented in the larger-scale Open Space project. Additionally, the implemented rendering system will have enough functionality to provide visualizations that can be used in presentations, videos et cetera.

1.3 Method

The thesis work will be carried out by implementing a volumetric rendering system from

the ground up. The input to this rendering system will be data from space weather

simulations. The system will be continuously improved as the work develops. Having

a basic functionality working early enables an iterative approach, and makes modular

implementation and testing easier as more advanced features are implemented.

1.4 Limitations

Since the thesis focuses on the rendering efficiency and the pipeline, less focus will be

put on the space weather application domain. Although the software will be capable of

rendering arbitrary volumetric data provided the right preprocessing steps are taken,

a smaller amount of time is spent on the applications than in the previous prototyping

phase (see section 2.3).

For the same reasons, the rendering techniques are very simple compared to what

is possible today.


1.5 Thesis Structure

To properly familiarize the reader with the subject and the project that this thesis

work is a part of, the thesis will start with a brief section on space weather and some

of the background of the collaboration. Then some previous work on Open Space and

computer graphics will be presented, before describing the implemented pipeline. The

chapter “Pipeline Overview” does not go into any implementation details, but is very

useful for putting the subsequent chapters in context. After the high-level overview

some time is spent on describing the implementation of one of the main techniques, the

Time-Space Partitioning (TSP) tree. These methods are used in many parts of the

pipeline, and are therefore also presented early in the thesis. Following the introductory

chapters are the three chapters that each describe a different part of the pipeline.

Results, future work topics and discussion of the work are presented last in the main

part of the thesis. To further explain some of the implementation details, an appendix

with selected code samples is included in the back.


2 Background

This chapter will provide context to the thesis by outlining the Open Space project,

and briefly discussing what space weather is and how it is being studied.

2.1 Space Weather

The National Research Council explains the concept of space weather in the following

way (1).

“Space weather” describes the conditions in space that affect Earth and its technological systems. Our space weather is a consequence of the behavior of the sun, the nature of Earth’s magnetic field and atmosphere, and our location in the solar system.

The National Space Weather Program Council has a similar description of the

subject and also mentions the effects that space weather can have on earth (2):

“Space weather” refers to conditions on the sun and in the solar wind, magnetosphere, ionosphere, and thermosphere that can influence the performance and reliability of space-borne and ground-based technological systems and can endanger human life or health. Adverse conditions in the space environment can cause disruptions of satellite operations, communications, navigation, and electric power distribution grids, leading to a variety of socioeconomic losses.


2.2 Community Coordinated Modeling Center

Given the possible effects on earth, it is desirable to study and predict space weather events. The Community Coordinated Modeling Center (CCMC) at NASA Goddard Space Flight Center works with space weather simulation and forecasting. The center also provides the scientific community access to the models and resources for development and research.

2.3 Open Space

The prototyping phase of the Open Space project resulted in a thesis by Törnros (3). This work contains a thorough summary of the modeling and simulation tools used at CCMC, as well as an overview of pre-existing visualization software. The thesis also presents an approach for visualizing space weather data by means of volumetric rendering and ray-casting. An open-source software for interactive volume rendering, Voreen (4), is used and extended to produce interactive renderings of space weather events. These renderings are done for one time step at a time. Screenshots from Törnros’ thesis can be found in figures 2.1 and 2.2.

The results of this prototyping phase provide an entry point for the work described

in this thesis, where the goal is to enable working with time-varying data sets.


Figure 2.1: Screenshot of a coronal mass ejection event visualization

Figure 2.2: Screenshot of the Voreen workspace


3 Previous Work

This chapter provides a theoretical background of the techniques and concepts used in the implementation, mainly related to volumetric visualization and rendering of large voxel datasets.

3.1 Volume Ray-Casting

Volume ray-casting is an image order volume rendering technique. This means that

the image is produced by iterating over pixels rather than iterating over objects in the

scene. To determine the color of each pixel, view rays are sent from the position of

the camera through the volume (figure 3.1), and the volume is sampled at points along

these rays.

As the samples along each ray are gathered, each intensity is mapped to an RGBA

color using a transfer function (figure 3.2). The colors from the transfer function

mappings are composited into the final ray color using front-to-back compositing. The

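equations for the composited color and opacity $C'_i$ and $A'_i$, given the accumulated values $C'_{i-1}$ and $A'_{i-1}$ and the mapped color and opacity $C_i$ and $A_i$ of sample $i$, are given in equation 3.1.

$$
\begin{aligned}
C'_i &= C'_{i-1} + (1 - A'_{i-1})\, C_i \\
A'_i &= A'_{i-1} + (1 - A'_{i-1})\, A_i
\end{aligned}
\tag{3.1}
$$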
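To make the recurrence in equation 3.1 concrete, the following is a minimal sketch of front-to-back compositing along a single ray. It is not taken from the thesis implementation: the function name, the tuple layout of the samples and the `opacity_cutoff` parameter (an optional early-termination threshold) are assumptions made for illustration, and the sample colors are assumed to be opacity-weighted already.

```python
def composite_front_to_back(samples, opacity_cutoff=0.99):
    """Composite (r, g, b, a) samples ordered from the camera into the volume.

    Assumption: each sample color is already opacity-weighted, so the update
    matches C'_i = C'_{i-1} + (1 - A'_{i-1}) * C_i and the same for opacity.
    """
    acc_color = [0.0, 0.0, 0.0]
    acc_alpha = 0.0
    for r, g, b, a in samples:
        weight = 1.0 - acc_alpha          # remaining transparency along the ray
        acc_color[0] += weight * r
        acc_color[1] += weight * g
        acc_color[2] += weight * b
        acc_alpha += weight * a
        if acc_alpha >= opacity_cutoff:   # hypothetical early ray termination
            break
    return tuple(acc_color), acc_alpha
```

Because the accumulated opacity only grows, samples behind a nearly opaque region contribute almost nothing, which is why stopping early is a common choice in this kind of loop.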


Figure 3.1: The concept of volume ray-casting. View rays are shot from a virtual eye/camera position through the image plane.

Figure 3.2: Top: The volume is sampled along the view ray. Bottom: Each sampled

intensity is mapped to a color using a transfer function.


3.2 TSP Tree Acceleration

While straightforward rendering techniques can be adequate for small data sets, they are often not efficient enough for large amounts of data with high requirements on speed. Researchers in the field of volumetric rendering and 3D graphics in general continuously strive to improve efficiency in data handling by the use of various acceleration structures.

One such structure is called a Time-Space Partitioning (TSP) tree, and is the chosen

data structure for this work. Implementation details are described in chapter 5.

Overview and Motivation

The TSP tree was first introduced by Shen et al. (5) and was later improved by Ellsworth et al. (6). It is designed to capture and exploit both temporal and spatial

coherency in a time-varying data set. The tree traversal algorithm uses user-supplied

error tolerances to choose the correct level of detail at runtime. By separating the

time domain from the spatial domain and treating them differently, the scheme can

efficiently handle data sets where there is a large discrepancy between the resolutions

in those domains. Error metrics are stored in the tree nodes, and the tree can be built

once and then used repeatedly.

Structure

A TSP tree uses a complete octree as a skeleton. This octree subdivides the volume

until a certain spatial subdivision level has been reached. Each octree node (inner nodes as well as leaves) in turn contains a binary search tree (BST) that holds the temporal information for that spatial subdivision. The binary search tree leaves are the

individual time steps, and each level above the leaves represents a time span of twice

the length. The binary search tree roots represent the whole temporal extent. In other

words, the search tree roots represent averages of the octree nodes’ values over all time

steps. The overall structure is illustrated in figure 3.3.
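The size of such a tree follows directly from the structure described above. The sketch below is illustrative only: the function names are invented here, and it assumes a complete octree, a number of time steps that is a power of two, and one stored brick per BST node.

```python
def octree_node_count(subdivisions):
    """Nodes in a complete octree: 1 + 8 + 8^2 + ... + 8^subdivisions."""
    return sum(8 ** level for level in range(subdivisions + 1))


def bst_node_count(timesteps):
    """Nodes in a complete binary tree whose leaves are the time steps."""
    if timesteps < 1 or timesteps & (timesteps - 1) != 0:
        raise ValueError("this sketch assumes a power-of-two number of time steps")
    return 2 * timesteps - 1


def tsp_node_count(subdivisions, timesteps):
    """Every octree node carries its own BST, so the counts multiply."""
    return octree_node_count(subdivisions) * bst_node_count(timesteps)
```

With two spatial subdivisions and eight time steps, an octree-based tree has 73 octree nodes with 15 BST nodes each, or 1095 nodes in total.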

Traversal

TSP tree traversal starts in the octree. For every octree node, the corresponding BST is traversed top-down until a node with satisfying error metrics is found or a leaf is reached. If the error at the leaf is too big, the traversal continues with the next subdivision level in the octree. See section 5.6 for the full TSP traversal algorithm.

Figure 3.3: TSP structure (illustrated using a quadtree). The example uses two spatial subdivisions and eight time steps. The top section represents the octree skeleton, and the bottom tree is the binary search tree for one of the octree nodes.
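The traversal idea described above can be sketched as follows. This is a simplified illustration and not the algorithm from section 5.6: the `BstNode` and `OctreeNode` classes, their field names and the `select_bricks` function are hypothetical, and the sketch assumes that a node is acceptable when both its spatial and temporal errors are within the user-supplied tolerances.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BstNode:
    brick: int                        # index of the brick stored for this node
    spatial_error: float
    temporal_error: float
    t_start: int                      # first time step covered (inclusive)
    t_end: int                        # last time step covered (exclusive)
    left: Optional["BstNode"] = None
    right: Optional["BstNode"] = None


@dataclass
class OctreeNode:
    bst_root: BstNode
    children: List["OctreeNode"] = field(default_factory=list)


def select_bricks(node, timestep, spatial_tol, temporal_tol):
    """Return the brick indices chosen for one time step."""
    bst = node.bst_root
    while True:
        if bst.spatial_error <= spatial_tol and bst.temporal_error <= temporal_tol:
            return [bst.brick]        # good enough: stop at this level of detail
        if bst.left is None:          # BST leaf reached, error still too big
            break
        middle = (bst.t_start + bst.t_end) // 2
        bst = bst.left if timestep < middle else bst.right
    if not node.children:             # octree leaf: nothing finer is available
        return [bst.brick]
    selected = []
    for child in node.children:       # continue with the next subdivision level
        selected.extend(select_bricks(child, timestep, spatial_tol, temporal_tol))
    return selected
```

Loose tolerances make the function stop near the roots and return a few coarse bricks, while tight tolerances push it towards the octree and BST leaves.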

Error Metrics

The concept of error metrics is key in the use of TSP tree techniques. To separate the

spatial and temporal domains, two different error metrics are used by the TSP tree

algorithm. The spatial error indicates how coherent the voxels within a subvolume are,

and the temporal error is a measure of how coherent the voxels between two or several

time steps are. The first TSP tree publication (5) uses an error metric based on the scalar values of voxels. To make the error metrics more accurate and more closely related to the visible image, a color-based approach was introduced (6). The color-based approach is useful for any image where a mapping from scalar values to colors is used, for example when using transfer functions.

3.3 Rendering of Large Voxel Datasets

One of the more prominent works in the field of voxel graphics is the GigaVoxels system, presented in the Ph.D. thesis by Crassin et al. (7). Their work outlines an extensive pipeline for handling very large sets of data. While the thesis extensively covers the subject of turning traditional scenes into voxels (in contrast to working directly with voxels), it also develops a sophisticated pipeline for processing the data. This pipeline mainly deals with transferring data between system and video memory through a custom GPU paging system.

Data Structure

GigaVoxels makes use of a spatial, octree-based structure for hierarchical space subdivision. The smallest entities in this subdivision, the octree nodes, are called bricks: small voxel grids that represent the volume’s subdivision at a given level. Bricks make it possible to use the efficient 3D texture features of GPUs while giving good flexibility in the subdivision.

Furthermore, GigaVoxels uses brick pointers. These pointers are separated from the data itself, only pointing to the original data in a brick pool. The hierarchical structure can be easily represented by using these brick pointers rather than the full data, making the traversal of the structure much faster.

Figure 3.4: Screenshot from the GigaVoxels system. Image from http://gigavoxels.inrialpes.fr/

Rendering

The rendering algorithm in GigaVoxels is split into two passes, both done by one big

GPU kernel. One pass traverses the octree top-down, and the other samples the volume.

The level of detail (subdivision level in the octree) is chosen during the traversal step

and is based on the projected size of the voxels on the screen.

The system is built for large datasets where the whole scene is far too big for video

memory. A GPU paging system loads data on-the-fly, getting requests by an ongoing

ray sampling pass. A caching system keeps track of the recently used bricks, making

room for new bricks in GPU memory when needed.
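The idea of a cache that evicts the least recently used brick can be sketched in a few lines. This is only an illustration of the general technique and not how GigaVoxels implements its GPU cache: the `BrickCache` class, its `request` method and the `load` callback are hypothetical names, and the cache here lives in ordinary host memory.

```python
from collections import OrderedDict


class BrickCache:
    """Fixed-capacity brick store with least-recently-used eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._slots = OrderedDict()        # brick id -> brick data, oldest first

    def request(self, brick_id, load):
        """Return the brick, calling load(brick_id) only on a cache miss."""
        if brick_id in self._slots:
            self._slots.move_to_end(brick_id)   # mark as most recently used
            return self._slots[brick_id]
        if len(self._slots) >= self.capacity:
            self._slots.popitem(last=False)     # evict the least recently used
        data = load(brick_id)
        self._slots[brick_id] = data
        return data

    def resident(self):
        """Brick ids currently held, from least to most recently used."""
        return list(self._slots)
```

Requesting bricks 1, 2, 1 and 3 from a cache with room for two bricks evicts brick 2, since brick 1 was touched more recently.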


4 Pipeline Overview

The implemented pipeline consists of three main stages called Furnace (data formatting), Forge (data pre-processing) and Flare (rendering). These three stages are described in detail in chapters 6, 7 and 8. This chapter provides a high-level overview of the whole pipeline.

4.1 Pipeline Stages

Figure 4.1: Overview of the three pipeline stages Furnace, Forge and Flare

The three parts of the pipeline do not interact with each other during runtime, and

each stage is run separately. The separation and encapsulation provide efficient and

customizable processing of the data, in which each phase refines and prepares the data


for rendering.

Furnace funnels various external data sources into a format that the rest of the

pipeline can use. Forge takes the output from Furnace and builds the tree structures

that Flare then uses during rendering, which is the final step.

4.2 Inputs and Outputs

To achieve encapsulation between the parts of the application, the preparation stages

Furnace and Forge output binary files on disk. This means that as soon as a previous

step is completed, the next step only needs the produced file and its structure.

The input to the first step, Furnace, is the volume data to render. This data could

be formatted and delivered in any way, so a Furnace module for each data source needs

to be written. Furnace then outputs a file where the voxel data for all time steps is

saved. Forge uses the straight-forward time-varying volume data and builds the TSP

tree structure and saves it to a new, separate file. The same input data can be used

to produce different configurations of the tree. Flare uses one of the TSP tree files and

renders it.


5 TSP Tree Implementation

Before going into the details of the pipeline stages, it is useful to know how the TSP

tree is implemented and how its structure affects different approaches and techniques

throughout the software. This chapter describes some implementation details in the

TSP tree usage.

5.1 Bricks

A tree structure where the leaves are individual voxels would induce a very large overhead. For this reason, the smallest element in the tree is a brick. Bricks are subvolumes of voxels. As an example, a volume of 128 × 128 × 128 voxels using 16 × 16 × 16 bricks would have room for 128/16 = 8 bricks per axis. Brick usage is a common concept and is used by both the TSP (5, 6) and the GigaVoxels (7) authors.
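The arithmetic in the example can be written out as a short sketch. The function name is invented for illustration, and the sketch assumes a cubic volume, cubic bricks, and a power-of-two number of bricks per axis so that the bricks map onto the leaves of a complete octree.

```python
def brick_grid(volume_dim, brick_dim):
    """Return (bricks per axis, leaf bricks in total, octree subdivisions)."""
    if volume_dim % brick_dim != 0:
        raise ValueError("the brick size must divide the volume size evenly")
    per_axis = volume_dim // brick_dim
    subdivisions = per_axis.bit_length() - 1     # log2 for powers of two
    if 1 << subdivisions != per_axis:
        raise ValueError("this sketch assumes a power-of-two number of bricks per axis")
    return per_axis, per_axis ** 3, subdivisions
```

For the 128 voxel volume with 16 voxel bricks this gives 8 bricks per axis, 512 leaf bricks and three octree subdivisions.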

5.2 Separation of Structure and Data

The use of bricks keeps the tree compact enough to store the whole structure in GPU

memory during rendering, since the individual voxels are not referenced. The tree

structure is separated from the raw voxel data. The tree only keeps track of the brick

numbers that correspond to the bricks saved on disk. The nodes in the structure store

the brick number and the number of the node’s child along with error metrics. This

approach is similar to the one described by Crassin et al. (7).


5.3 Memory Layout

The TSP tree was originally described as an “octree of binary search trees”, meaning

that each node in an octree skeleton contains a binary search tree, whose nodes in

turn contain the data. The implementation in this thesis uses that structure (described

in chapter 3) during traversal, but the brick numbering and data ordering that get

saved to disk are slightly different. To enable efficient sequential loading of bricks

during rendering (chapter 8) the data is instead ordered so that the nodes of the

octrees are saved next to each other. This leads to a different pointer structure, and a

structure that can be viewed as a “binary search tree of octrees”. The brick structure

is illustrated in figures 5.1 and 5.2.

Figure 5.1: Conceptual tree layout, differing from the structure originally described. The

layout can be thought of as a binary search tree of octrees. The figure uses a quadtree for

illustration.

The main benefit of this approach is that bricks among the same BST and octree

levels will have consecutive brick numbers. As will be discussed in chapter 10, the

rotating nature of a Coronal Mass Ejection (CME) event simulation makes spatial

filtering more useful than temporal filtering in the current implementation. This means

that during runs, bricks will often be chosen from the same BST node. Additionally, it is probable that bricks close to each other in the octree will be used simultaneously. By storing the bricks of one temporal filtering step together, the two-stage rendering scheme (see chapter 8) can first identify a sequence of bricks, and the uploading stage can then read them from disk with a single read operation rather than fetching them from different parts of the disk.
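The benefit of consecutive brick numbers can be illustrated by grouping a list of requested bricks into contiguous runs, each of which could be served by one read. This is a sketch under assumptions, not the thesis implementation: the function name and the `(start, count)` representation of a run are invented here.

```python
def coalesce_runs(brick_indices):
    """Group requested brick indices into (first index, count) runs."""
    runs = []
    for index in sorted(set(brick_indices)):
        if runs and index == runs[-1][0] + runs[-1][1]:
            runs[-1][1] += 1              # extends the current contiguous run
        else:
            runs.append([index, 1])       # starts a new run
    return [(start, count) for start, count in runs]
```

The fewer runs a frame produces, the fewer separate read operations the uploading stage needs, which is the motivation for storing bricks that tend to be used together next to each other.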


Figure 5.2: Memory layout of the TSP tree, using quadtrees instead of octrees. Numbers

correspond to brick indices.

5.4 Error Metrics

The original TSP tree paper by Shen et al. (5) uses the coefficient of variation to indicate the error for a brick. This coefficient is defined as the standard deviation σ over the average µ. Shen et al. implement this by first calculating the average voxel value for each brick, then calculating the standard deviation in the same brick, and finally producing the said ratio.

This approach seems to have several problems. It is desirable to get a higher error
the further we get from the original data (the leaves) in an octree, but the standard
deviation gets lower and lower the more averaged (filtered) the higher-up bricks get.
Additionally, dividing by the mean produces very large and strongly varying values when
the average gets close to zero. In large and empty areas of the volume the error should
be very low, but dividing by small values is numerically unstable. For these reasons, the
error metric calculation has been slightly modified.
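The instability near zero can be illustrated with a small numeric sketch. The function name and the sample values are made up for illustration and are not taken from the thesis implementation; the population form of the standard deviation is assumed.

```python
import math

def coefficient_of_variation(values):
    """Standard deviation divided by the mean (population form)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return math.sqrt(variance) / mean  # undefined when the mean is exactly zero

# Hypothetical bricks: one bright and clearly varying, one almost empty.
bright = [0.4, 0.5, 0.6, 0.5]
nearly_empty = [0.0, 0.0, 0.0, 1e-6]

cv_bright = coefficient_of_variation(bright)
cv_empty = coefficient_of_variation(nearly_empty)
```

With these values the nearly empty brick gets a coefficient of about 1.73, roughly twelve times that of the bright brick (about 0.14), even though its absolute variation is negligible.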

The spatial error is calculated by first calculating the average $\bar{v}_{\text{brick}}$ for each brick,
where n is the number of voxels in a brick (equation 5.1).

\[
\bar{v}_{\text{brick}} = \frac{1}{n} \sum_{i=0}^{n-1} v_i \tag{5.1}
\]

Instead of the standard deviation, a modified version is calculated. This is done by
comparing the brick average with the voxel values in the leaf bricks that are covered by
this particular brick. The equation for the modified spatial error (5.2) is similar to a
regular standard deviation calculation. m stands for the number of covered leaf bricks and n
stands for the number of voxels per brick.

\[
e_{\text{spatial}} = \sqrt{\frac{1}{m \cdot n} \sum_{j=0}^{m-1} \sum_{i=0}^{n-1} \left(v_{i,j} - \bar{v}_{\text{brick}}\right)^2} \tag{5.2}
\]
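As a minimal sketch of equations 5.1 and 5.2, assuming that bricks are available as flat lists of voxel values (the function names and the data layout are illustrative, not the thesis code):

```python
import math

def brick_average(voxels):
    """Equation 5.1: mean voxel value of one brick."""
    return sum(voxels) / len(voxels)

def spatial_error(brick, covered_leaves):
    """Equation 5.2: RMS deviation of the covered leaf voxels from the brick average.

    brick          -- list of n voxel values of the (filtered) brick
    covered_leaves -- list of m leaf bricks, each a list of n voxel values
    """
    avg = brick_average(brick)
    m = len(covered_leaves)
    n = len(brick)
    total = sum((v - avg) ** 2 for leaf in covered_leaves for v in leaf)
    return math.sqrt(total / (m * n))

# Hypothetical example: a parent brick covering two very different leaves.
leaves = [[0.0, 0.0], [1.0, 1.0]]
parent = [0.5, 0.5]
parent_error = spatial_error(parent, leaves)       # 0.5
leaf_error = spatial_error(leaves[1], [leaves[1]])  # 0.0, a leaf covers only itself
```

In this sketch the error grows with the distance between a filtered brick and the original data it covers, and a uniform leaf has zero error, which is the behaviour asked for above.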

The temporal error metric is calculated by first calculating the average value over

time for each voxel (at positions i). See equation 5.3 where l is the number of time

steps.

\[
\bar{v}_i = \frac{1}{l} \sum_{t=0}^{l-1} v_{i,t} \tag{5.3}
\]

These values are then used to calculate the average modified standard deviation

per voxel. Subsequently the voxel standard deviations are averaged per brick. This

average is the temporal error metric, illustrated in equation 5.4, where n is the number

of voxels per brick and m is the number of covered leaf bricks. Implementation details

can be found in section 8.2.

\[
e_{\text{temporal}} = \frac{1}{n} \sum_{i=0}^{n-1} \sqrt{\frac{1}{m} \sum_{j=0}^{m-1} \left(\bar{v}_i - v_{i,j}\right)^2} \tag{5.4}
\]
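The temporal metric can be sketched in the same style. This is only an illustration under two assumptions that the text does not spell out: the covered leaf bricks are the bricks of the individual time steps spanned by the node, and the per-voxel average of equation 5.3 is taken over exactly those bricks. The function name is hypothetical.

```python
import math

def temporal_error(covered_leaves):
    """Average per-voxel deviation over time, in the spirit of equation 5.4.

    covered_leaves -- list of m bricks (one per covered time step), each a
                      list of n voxel values at the same positions
    """
    m = len(covered_leaves)
    n = len(covered_leaves[0])
    per_voxel = []
    for i in range(n):
        series = [leaf[i] for leaf in covered_leaves]
        mean_i = sum(series) / m  # average over time (assumed: over the covered steps)
        variance_i = sum((mean_i - v) ** 2 for v in series) / m
        per_voxel.append(math.sqrt(variance_i))
    return sum(per_voxel) / n

# Hypothetical example: voxel 0 changes over time, voxel 1 is constant.
changing = temporal_error([[0.0, 1.0], [2.0, 1.0]])  # (1.0 + 0.0) / 2 = 0.5
static = temporal_error([[3.0, 3.0], [3.0, 3.0]])    # 0.0
```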

5.5 Pointer Structure

To save space, each tree node stores only one child pointer. The pointer can have

different meanings to enable traversal of both the octrees and the overall binary search

tree. If the binary search tree is the root, the child pointer is used to access the

octree. Otherwise, the child pointer points to the BST child node. The pointer usage

is implemented in the traversal scheme.

5.6 Traversal

The rendering algorithm (see section 8) uses two separate TSP tree traversal passes.

Both passes traverse the tree structure in the same way. For traversing the TSP tree,

the high-level approach suggested by Shen et al. (5) is used. A flowchart for the overall

TSP tree traversal can be found in figure 5.3.


Figure 5.3: Flowchart of the TSP tree traversal algorithm. OT - Octree, BST - Binary

search tree.



For the internal octree traversal, a version of the KD-restart algorithm by
Horn et al. (8), modified for octrees, is used. This algorithm is stackless, which is very
useful when traversing a structure on a GPU with limited memory and stack depth
capabilities.

The hybrid tree traversal implementation can be found in the code samples, section

A.1.


6 Data Formatting

The first pipeline stage, Furnace, extracts volumetric data from various data sources

and produces a format that the subsequent stages use. This format is designed to be

very general and its main purpose is to act as an abstraction layer, leaving optimizations

to procedures later in the pipeline.

6.1 Space Weather Data Sources

While the Open Space software will be capable of handling a large variety of data

sources, space weather related data sources are used throughout the thesis work. This

benefits CCMC, and is a natural continuation of the first stage of the project.

ENLIL

The main data source for this project is the ENLIL model by Xie et al. (9). The model

describes the heliosphere in terms of plasma mass, momentum and energy, among other

variables. ENLIL is used to describe Coronal Mass Ejection (CME) events.

Simulations using this model can be accessed from the CCMC web site (10), and

this project has mainly used a run titled Hong Xie 120312 SH 1 during development.

CDF Data Format

The CCMC space weather event data from simulations is stored in the standardized
CDF (Common Data Format) file format. CDF files store the large number of variables
that the simulation runs generate, as well as additional metadata.


Kameleon

The tool Kameleon (11), developed and maintained at CCMC, is made to extract data

from the CDF files. The software acts as an abstraction layer between the model

data and applications. Kameleon provides access as well as interpolation, allowing

applications to extract spatial and temporal data at arbitrary points.

6.2 Furnace

The very first step in the pipeline is formatting the data. Furnace takes care of this task
using custom modules for the different data sources. These sources include ENLIL data
in the CDF format, and the module that handles ENLIL uses Kameleon to extract the
chosen data. Furnace is configured using a few basic parameters: the location of the
input and output, the type of data source and the desired dimensions of the output volume.

Figure 6.1: Schematic overview of the data formatting stage Furnace

6.3 Voxel Data Format

The output from Furnace is called Voxel Data Format (VDF). The volume data is

represented by floats and is ordered by time steps. The voxels in each frame are ordered

by indices, given by equation 6.1, where xDim, yDim and zDim are the number of voxels

along each axis in the volume.

\[
i_{x,y,z} = x + y \cdot \mathit{xDim} + z \cdot \mathit{xDim} \cdot \mathit{yDim} \tag{6.1}
\]
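A minimal sketch of this index mapping and its inverse, assuming that x varies fastest, then y, then z, so that one step in z skips a full xDim · yDim slice. The function names are illustrative.

```python
def voxel_index(x, y, z, x_dim, y_dim):
    """Linear index of voxel (x, y, z) within one time step."""
    return x + y * x_dim + z * x_dim * y_dim

def voxel_coords(index, x_dim, y_dim):
    """Inverse mapping: recover (x, y, z) from a linear index."""
    z, rest = divmod(index, x_dim * y_dim)
    y, x = divmod(rest, x_dim)
    return x, y, z

# Walking z, then y, then x (innermost) visits the indices in file order.
order = [voxel_index(x, y, z, 4, 3)
         for z in range(2) for y in range(3) for x in range(4)]
```

For a 4 × 3 × 2 volume, `order` is simply 0, 1, ..., 23, which is a quick way to check that every voxel gets a unique position.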


The data is stored in a binary file along with some header data. The header data

describes the type of coordinates (currently Cartesian or spherical), the dimensions and

the number of time steps. The VDF file format is described in table 6.1.

| Data field | Representation |
| --- | --- |
| Grid type | unsigned integer |
| Number of time steps | unsigned integer |
| x dimension | unsigned integer |
| y dimension | unsigned integer |
| z dimension | unsigned integer |
| Voxel data | float |

Table 6.1: VDF data format
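Reading and writing a file with this layout could look roughly as follows. Table 6.1 does not state field widths or byte order, so the sketch assumes 32-bit little-endian unsigned integers and 32-bit floats; the function names and the grid type value 0 are hypothetical as well.

```python
import io
import struct

# Assumed layout: five little-endian 32-bit unsigned integers, then the voxels.
HEADER = struct.Struct("<5I")

def write_vdf(stream, grid_type, num_timesteps, dims, voxels):
    """Write the header fields of table 6.1 followed by all voxel values."""
    x, y, z = dims
    stream.write(HEADER.pack(grid_type, num_timesteps, x, y, z))
    stream.write(struct.pack(f"<{len(voxels)}f", *voxels))

def read_vdf(stream):
    """Read the header, then as many floats as the header announces."""
    grid_type, num_timesteps, x, y, z = HEADER.unpack(stream.read(HEADER.size))
    count = num_timesteps * x * y * z
    voxels = struct.unpack(f"<{count}f", stream.read(4 * count))
    return grid_type, num_timesteps, (x, y, z), list(voxels)

# Round trip through an in-memory buffer: two time steps of a 2x1x1 volume.
buffer = io.BytesIO()
write_vdf(buffer, 0, 2, (2, 1, 1), [0.0, 0.5, 1.0, 0.25])
buffer.seek(0)
loaded = read_vdf(buffer)
```

Because the voxel count follows from the header, a reader never needs a separate length field, which keeps the format as simple as the table suggests.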


7 Data Pre-Processing

The data needs to be re-formatted from the straight-forward structure into a TSP tree.

The second pipeline stage, Forge, takes care of this process and outputs files which the

rendering stage can use directly.

7.1 Forge

The input is a file in the VDF format described in chapter 6. Forge can be customized to

output TSP trees with different brick sizes. The desired brick size is the only parameter

to Forge, apart from input and output file names.

Figure 7.1: Overview of Forge


7.2 TSP Tree Construction

The TSP tree construction process consists of several stages. The stages use temporary

files as storage where possible to avoid problems when building trees which are too large

to keep in memory at once.

Brick Padding

Bricks will eventually be stored in a 3D texture before getting rendered (see section

8.3 for details). This texture, the brick atlas, does not keep the spatial information

intact. The bricks may be uploaded in any order, and therefore a standard 3D texture

interpolation will fail when sampling close to brick borders. To handle this, each brick

gets padded with a layer of voxels from its spatial neighbors before getting put in the

tree structure. The padding is carried out in two steps, illustrated in figure 7.2:

1. Add an extra layer of voxels around the whole volume, copying the closest border

voxel. This provides neighbors for the bricks on the border.

2. Treat each brick in isolation, and add an extra layer of voxels around each of the

bricks. The added voxels are copies of the neighboring voxels in the volume. Note

that this step is only done for the original voxels, not the extra layer we added in

the previous step.

Figure 7.2: Example of (2D) padding. Original volume is 4 × 4 with 2 × 2 bricks. The

resulting volume (right) is 8 × 8 with 4 × 4 padded bricks, getting the padding from the

neighboring pixels or the added outer border.


The sampling scheme makes sure that samples are always taken inside or on the

border of the original bricks. This ensures that correct interpolation can be done, since

the neighboring voxels will be from the original data.
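The two padding steps can be sketched in 2D, mirroring the example in figure 7.2. This is an illustrative sketch, not the code from section A.2: the function name is hypothetical, and the outer border of step 1 is realized by clamping coordinates instead of building an enlarged copy of the volume.

```python
def pad_bricks_2d(volume, brick):
    """Return a dict mapping (brick_row, brick_col) to a padded brick.

    volume -- square 2D list of values
    brick  -- brick edge length; must divide the volume edge length
    """
    size = len(volume)

    # Step 1: a one-voxel border that copies the closest border voxel.
    # Clamping out-of-range coordinates yields exactly those copies.
    def bordered(r, c):
        return volume[min(max(r, 0), size - 1)][min(max(c, 0), size - 1)]

    # Step 2: cut out every brick together with one layer of neighbors.
    padded = {}
    for br in range(size // brick):
        for bc in range(size // brick):
            r0, c0 = br * brick, bc * brick
            padded[(br, bc)] = [
                [bordered(r, c) for c in range(c0 - 1, c0 + brick + 1)]
                for r in range(r0 - 1, r0 + brick + 1)
            ]
    return padded

# The 4 x 4 volume with 2 x 2 bricks from figure 7.2, filled with 0..15.
volume = [[4 * r + c for c in range(4)] for r in range(4)]
bricks = pad_bricks_2d(volume, 2)
```

The result is four padded 4 × 4 bricks, 64 values in total, which corresponds to the 8 × 8 volume in the figure.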

The downside of the approach is the added number of voxels. Table 7.1 lists
some typical brick sizes and compares the unpadded and the padded voxel
counts. As can be seen, padding with an extra layer of voxels results in a significant
voxel increase for a 256×256×256 volume with 256 timesteps. For the 8×8×8 bricks
case, padding means almost a doubling of the volume size. For larger brick sizes, the
added overhead is smaller. Sample code for the brick padding can be found in section
A.2.

| Brick size | Unpadded voxel count | Volume size with padding | Padded voxel count |
| --- | --- | --- | --- |
| 8 × 8 × 8 | 9,797,856,768 | 320 × 320 × 320 | 19,136,439,000 |
| 16 × 16 × 16 | 9,797,595,136 | 288 × 288 × 288 | 13,950,091,512 |
| 32 × 32 × 32 | 9,795,502,080 | 272 × 272 × 272 | 11,749,341,240 |

Table 7.1: Comparison of unpadded and padded voxel counts for a 256×256×256 volume
with 256 timesteps
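The counts in table 7.1 can be reproduced with a short calculation. The sketch assumes that every BST node holds a full octree and that the BST over T time steps is a full binary tree with 2T − 1 nodes; the function name is illustrative.

```python
def tsp_voxel_counts(volume_dim, brick_dim, timesteps):
    """Return (unpadded, padded) voxel counts for a full TSP tree."""
    # Octree nodes: bricks summed over all levels, from the leaves to the root.
    bricks_per_axis = volume_dim // brick_dim
    octree_nodes = 0
    while bricks_per_axis >= 1:
        octree_nodes += bricks_per_axis ** 3
        bricks_per_axis //= 2
    # A full binary tree over the time steps has 2 * timesteps - 1 nodes.
    bst_nodes = 2 * timesteps - 1
    total_bricks = octree_nodes * bst_nodes
    unpadded = total_bricks * brick_dim ** 3
    # One extra layer on each side adds two voxels per axis and brick.
    padded = total_bricks * (brick_dim + 2) ** 3
    return unpadded, padded

counts_8 = tsp_voxel_counts(256, 8, 256)
counts_16 = tsp_voxel_counts(256, 16, 256)
counts_32 = tsp_voxel_counts(256, 32, 256)
```

For 8 × 8 × 8 bricks this gives 9,797,856,768 unpadded and 19,136,439,000 padded voxels, a factor of about 1.95, while 32 × 32 × 32 bricks only grow by a factor of about 1.2.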

Octree Construction

The first step in building the TSP tree is building one full octree from each time step

in the input data. The octree construction is done by first rearranging the data into

bricks of the chosen size and padding them (see previous section). These bricks are

then given a new index using Z ordering (12). The Z-order (or Morton order) number

arranges the bricks so that the nodes that will make up the children of a higher level

are ordered next to each other. The octree is then built from the bottom up, averaging

the bricks in groups of eight to build the parent nodes of the higher levels. The octree

construction process is illustrated in figure 7.3.

Each octree is saved to a separate file on disk, avoiding limitations on memory.

Sample code for the octree construction can be found in section A.3.
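The role of the Z-order numbering can be sketched as follows. This is not the code from section A.3: the bit order (x in the lowest position) is an assumption, the function names are hypothetical, and each brick is reduced to a single value to keep the sketch short, whereas a real parent brick is a downsampled version of its eight children.

```python
def morton3(x, y, z, bits=10):
    """Z-order (Morton) number: interleave the bits of x, y and z."""
    code = 0
    for b in range(bits):
        code |= ((x >> b) & 1) << (3 * b)
        code |= ((y >> b) & 1) << (3 * b + 1)
        code |= ((z >> b) & 1) << (3 * b + 2)
    return code

def build_octree_levels(leaves):
    """Build levels bottom-up by averaging runs of eight Z-ordered bricks.

    leaves -- brick values sorted by Morton number; length must be a power of 8
    """
    levels = [leaves]
    while len(levels[-1]) > 1:
        below = levels[-1]
        levels.append([sum(below[i:i + 8]) / 8 for i in range(0, len(below), 8)])
    return levels

# 4 x 4 x 4 bricks: sort brick coordinates by Morton number.
coords = [(x, y, z) for z in range(4) for y in range(4) for x in range(4)]
ordered = sorted(coords, key=lambda c: morton3(*c))
levels = build_octree_levels([float(i) for i in range(64)])
```

The first eight entries of `ordered` are exactly the bricks in the 2 × 2 × 2 corner at the origin, so the siblings of one parent end up next to each other and each parent can be built from one contiguous run.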

Figure 7.3: Octrees are built by first giving the bricks a new number, and then averaging
nodes to build higher levels until the root is constructed.

BST Assembling

When the octrees are built, they get assembled into the “binary search tree of octrees”
described in chapter 5. This process is relatively simple. First, the leaf level of the BST
(corresponding to individual time steps) is constructed by using the individual octrees
as leaves. Then, the higher levels are built by averaging the two octrees below, so that
each higher step represents a time span of twice the length of the spans on the lower
level. This process is repeated until the root BST node has been constructed.

Sample code for the BST assembling can be found in section A.4.
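The bottom-up assembly can be sketched as below. The sketch is not the code from section A.4; it assumes that each octree is available as a flat list of values, that the number of time steps is a power of two, and that everything fits in memory, whereas the implementation described here works through temporary files. The function name is hypothetical.

```python
def assemble_bst(timestep_octrees):
    """Build the BST levels bottom-up; returns the levels from leaves to root.

    timestep_octrees -- one flattened octree (list of values) per time step
    """
    levels = [timestep_octrees]
    while len(levels[-1]) > 1:
        below = levels[-1]
        level = []
        for i in range(0, len(below), 2):
            left, right = below[i], below[i + 1]
            # A parent covers twice the time span and averages its two children.
            level.append([(a + b) / 2 for a, b in zip(left, right)])
        levels.append(level)
    return levels

# Four time steps with one value each: 4 leaves, 2 inner nodes, 1 root.
bst_levels = assemble_bst([[0.0], [2.0], [4.0], [6.0]])
```

Four time steps give 4 + 2 + 1 = 7 nodes, in line with the 2T − 1 nodes of a full binary tree over T time steps.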

7.3 TSP Data Format

The output from Forge, TSP files, adds a few header fields to the format inherited from

Furnace. The additional values concern brick dimensions. The TSP format is described

in table 7.2.

| Data field | Representation |
| --- | --- |
| Grid type | unsigned integer |
| Number of time steps | unsigned integer |
| x dimension | unsigned integer |
| y dimension | unsigned integer |
| z dimension | unsigned integer |
| x brick dimension | unsigned integer |
| y brick dimension | unsigned integer |
| z brick dimension | unsigned integer |
| Number of bricks along x | unsigned integer |
| Number of bricks along y | unsigned integer |
| Number of bricks along z | unsigned integer |
| Voxel data | float |

Table 7.2: TSP data format


8 Rendering

The final piece of the pipeline is the rendering. The software producing the final

renderings is called Flare, and represents the largest of the three stages.

8.1 Flare

Flare renders TSP files from the pre-processing step. The renderings are customized by

choosing a number of parameters and a transfer function, both topics described later

in this chapter.

Figure 8.1: Overview of Flare.


8.2 TSP Structure and Error Metrics Construction

The overall details of the TSP tree structure and the error metrics can be found in

chapter 5. This section further describes some of the implementation techniques and

refers to source code samples.

TSP Structure Construction

The TSP structure used for traversal is kept in memory, both on the host and the GPU.

It is not explicitly stored on disk, so it needs to be constructed before the rendering

loop can be initiated. The construction is relatively quick and traverses the whole

brick structure on disk once, keeping track of child indices and allocating space for

error metrics (to be calculated in the next steps). Code for the construction function

can be found in section A.5.

Error Metrics Calculation

The spatial error calculation runs in two passes. The first pass calculates the average

voxel value for each brick, and the second pass compares the brick average to the leaves that

the current brick covers.

The temporal error calculation also uses several passes. The first pass is run to keep

track of each voxel’s average value over time. Then the modified standard deviation is

calculated per voxel and then averaged over bricks.

The error calculation can be omitted. If the user does not want to tolerate any error,

the calculation step is skipped and traversal will always reach the leaves.

Equations for error calculation can be found in section 5.4, and code samples in

section A.6 of the appendix.

Error Caching

Since the current implementation of the error metrics only depends on the voxel data,

the calculated error for a TSP file can be reused. Flare saves the error metrics to a file

which is read before subsequent renderings. The file is small and reading is fast. The

simple caching approach enables pre-calculation of the error metrics. It is separated

from Forge since the user may want to use different kinds of error metrics, in particular
color-based error metrics that depend on the current transfer function rather than the
raw intensity values (6). Such error metrics need to be re-calculated at every change
of transfer function, and the mechanism therefore belongs in Flare.

8.3 Intra-Frame Pipeline

The rendering algorithm is two-staged. The data needs to be fetched from disk and

uploaded to GPU memory, and a TSP tree probing step is responsible for requesting

the right bricks to upload. When the bricks are uploaded, the ray-casting step renders

the images. This section describes the intra-frame steps taken in detail.

View Ray Generation

When the model, view and projection matrices are updated, it is time to calculate the

direction of the view rays. The algorithm used for this was proposed by Kruger and

Westermann (13) and relies on rendering a colored cube. The volume to be rendered is

bounded by a cube with its opposing corners in (0, 0, 0) and (1, 1, 1), respectively (see

figure 8.2). The cube is colored by letting each corner vertex also represent a color.

The aforementioned vertices therefore represent a black and a white corner.

Figure 8.2: Bounding cube vertices

A simple GLSL shader interpolates the corner colors across the faces of the cube,
resulting in a fully colored cube where the color of a point on the surface also represents
the point’s position in space. By rendering this colored bounding cube twice, one time
with back face culling and one time with front face culling (figure 8.3), the view rays
can be calculated. Given coordinates on the view plane, the direction of a view ray is
calculated by taking the difference between the color of the front facing point and
the color of the back facing point (figure 8.4).
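As a small sketch of this step: since the colors encode positions in the unit cube, the ray for one pixel follows from the two sampled colors. The function name is hypothetical, and the sign convention (exit point minus entry point, so that the ray points into the volume) is an assumption.

```python
import math

def view_ray(entry_color, exit_color):
    """Return (unit direction, length) of the ray between two cube colors.

    entry_color -- RGB sampled from the front faces (ray entry point)
    exit_color  -- RGB sampled from the back faces (ray exit point)
    """
    direction = [b - a for a, b in zip(entry_color, exit_color)]
    length = math.sqrt(sum(d * d for d in direction))
    if length == 0.0:
        return None, 0.0  # the pixel does not cover the bounding cube
    return [d / length for d in direction], length

# A ray entering at the black corner and leaving at the white corner.
diagonal, diagonal_length = view_ray((0.0, 0.0, 0.0), (1.0, 1.0, 1.0))
```

The length is useful as well, since it tells the kernels how far to march along the ray before leaving the volume.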

Figure 8.3: Bounding cubes with interpolated colors. Left: Front faces. Right: Back

faces.

Figure 8.4: Example of entry and exit point samples and resulting ray direction.

TSP Tree Probing

The data resides on a file on disk and needs to be uploaded to the GPU memory before

the rendering can take place. To consolidate the uploading of all bricks into one single

request and thereby minimize transfer overhead, a probing step is used. The TSP tree

probing is a dry-run of the rendering, where the result is a brick request list rather

than a rendered image. This probing uses the same view rays and the same traversal
algorithm as the subsequent rendering. Rays are shot through the volume, and every
time a brick with acceptable error metrics (or a leaf) is found, the responsible OpenCL
kernel increases a value in an array where the indices correspond to brick indices. After
the probing, the bricks that will be needed have a count that is higher than zero. This
process is illustrated in figure 8.5.

Figure 8.5: Brick counts before and after a tree probing step. Initially, all counts are

set to zero. After the probing step, bricks which will be used during rendering will have a

count higher than zero. The example uses only two view rays for simplicity and does not

show the tree traversal process.
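The bookkeeping of the probing step can be sketched on the host side as follows. The tree traversal itself is left out: each ray is represented by the list of brick indices it would settle on, and the function name is hypothetical. In the OpenCL kernel, many rays update the array concurrently, which this sequential sketch does not model.

```python
def probe(rays, num_bricks):
    """Count how often each brick is requested by the given rays.

    rays       -- one list of brick indices per view ray
    num_bricks -- total number of bricks in the TSP tree
    """
    counts = [0] * num_bricks  # initially, all counts are zero
    for visited in rays:
        for brick_index in visited:
            counts[brick_index] += 1
    return counts

# Two view rays, as in figure 8.5, sharing brick 3.
request_counts = probe([[0, 3], [3, 5]], num_bricks=8)
```

Only the distinction between zero and non-zero matters for the upload; the counts themselves are a by-product of letting every ray increment the same array.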

Brick Uploading

The brick request list generated by the probing step is scanned, and every brick that

has a count higher than zero is put into a brick list. While this brick list is built, every

added brick also gets a coordinate in the brick atlas, the 3D texture that will hold the

uploaded bricks. This coordinate is saved in the brick list and thereby maps every

brick to a unique atlas lookup position. Note that the atlas coordinates do not have
any spatial meaning; they are only a way to keep track of where the rendering kernel will
fetch the data.
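Building the brick list could look roughly like this. The sketch assumes a cubic atlas that is filled slot by slot in x, y, z order; the actual assignment order in Flare is not stated here, and the function name is hypothetical.

```python
def build_brick_list(counts, atlas_bricks_per_axis):
    """Map every requested brick to a unique atlas coordinate.

    counts                -- request count per brick index from the probing step
    atlas_bricks_per_axis -- number of brick slots along each atlas axis
    """
    capacity = atlas_bricks_per_axis ** 3
    brick_list = []
    for brick_index, count in enumerate(counts):
        if count == 0:
            continue  # brick is not needed for this frame
        slot = len(brick_list)
        if slot >= capacity:
            raise ValueError("brick atlas is full")
        x = slot % atlas_bricks_per_axis
        y = (slot // atlas_bricks_per_axis) % atlas_bricks_per_axis
        z = slot // (atlas_bricks_per_axis ** 2)
        brick_list.append((brick_index, (x, y, z)))
    return brick_list

brick_list = build_brick_list([0, 2, 0, 1, 1], atlas_bricks_per_axis=2)
```

Bricks 1, 3 and 4 end up in neighboring atlas slots although they need not be spatial neighbors, which is why the padding described in chapter 7 is required for correct interpolation.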

The data upload occurs in two steps. First, the data from disk is copied to an OpenGL
Pixel Buffer Object (PBO) that is mapped to system memory. The PBO corresponds
to the 3D texture that will store the atlas. The copying is done by scanning the brick
list and placing each brick in the correct spot in the PBO. If the uploading algorithm
detects several consecutive bricks to be uploaded, those are read together to avoid disk
read overhead.
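Detecting consecutive bricks amounts to grouping the requested indices into runs, each of which can be served by one read. A minimal sketch with a hypothetical function name:

```python
def consecutive_runs(brick_indices):
    """Group brick indices into (first_index, run_length) pairs."""
    runs = []
    for index in sorted(brick_indices):
        if runs and index == runs[-1][0] + runs[-1][1]:
            runs[-1][1] += 1  # extends the current run
        else:
            runs.append([index, 1])  # starts a new run
    return [(first, length) for first, length in runs]

# Six requested bricks collapse into three read operations.
read_plan = consecutive_runs([3, 4, 5, 9, 10, 20])
```

Here `read_plan` is `[(3, 3), (9, 2), (20, 1)]`. With a fixed brick size in bytes, each pair translates directly into a file offset and a read length, and the storage order from chapter 5 is what makes long runs likely.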

When the PBO is populated with the brick data, an OpenGL 3D texture is built

by copying the pixels from the PBO. The 3D texture is then ready to use by the GPU

rendering kernel. See figure 8.6 for a schematic overview of the brick upload process.

Figure 8.6: Brick uploading. The bricks in the brick list (top) are read from disk, copied

to the right position in the PBO in memory and then uploaded to the 3D texture on the

GPU.

Ray-Casting

The rendering kernel can be launched as soon as the 3D texture is ready. The rendering

process traverses the TSP tree in the same way as during probing (see section 3.2) and

with the same view parameters, thus visiting the same bricks. The sample position

(converted to spherical coordinates if needed) gets translated to the correct texture

atlas coordinates and samples the brick in that atlas cell. The samples are composited

in the manner described in section 3.1 to render the final color for the sampled view

plane coordinate.


8.4 Asynchronous Execution

The rendering of each individual frame must follow the logical order presented above,
but the pipeline’s bandwidth can be used in a more efficient way by interleaving the
rendering steps of two rendering iterations. Since different tasks in the pipeline
are handled by different parts of the system, parallel execution is important for
performance. In a simple model, some tasks are handled by the CPU and others by the GPU.
Figure 8.7 describes the order in which tasks are processed. Note that the figure does
not show the correct relation between execution times. See section 9.2 for measured
times, and section A.7 in the appendix for the complete rendering loop code.

Figure 8.7: Simplified overview of the interleaving of rendering steps. Each color
represents a different time step. The figure shows two rendering iterations, where the frame at
t=2 (in green) gets completely processed at the same time as the neighboring frames get
finalized or initiated.

8.5 Rendering Parameters

The rendering application can be configured using a few different parameters. Below

is a list of these parameters and their meaning.

Local OpenCL work size (x and y) Changes the local work size for the OpenCL

probing and ray-casting kernels. Can be used to tune performance. See NVIDIA’s

OpenCL Best Practices Guide (14) for performance heuristics.

Texture division factor A higher factor than 1 decreases the output texture size,

thereby lowering the number of calculated view rays. This factor can be used to

easily give up quality to gain speed.

Spatial error tolerance Maximum tolerable spatial error.

Temporal error tolerance Maximum tolerable temporal error.


Error calculation (on/off) If turned off, the error calculation step is skipped and

tolerances are set to zero.

Probing step size Step size in the probing kernel.

Ray-casting step size Step size in the ray-casting kernel.

Ray-casting intensity A factor that the final colors get multiplied with. Used to adjust

image brightness.

8.6 Cluster Rendering

Large dome theater displays often use a cluster of rendering computers and projectors

to be able to render on very large and curved screens. Such a cluster needs to be able

to divide the rendering work between its nodes, in such a way that each node renders

a portion of the scene without visible seams and artifacts.

SGCT

Simple Graphics Cluster Toolkit (SGCT) is developed at Linköping University. It is a
cross-platform C++ library enabling graphics synchronization over a cluster of
computers. A rough and basic integration of SGCT is used as the rendering system
for the work presented in this thesis, enabling dome rendering and stereography as well
as standard desktop rendering.


9 Results

9.1 Hardware

Development, rendering and testing have been carried out on a standard desktop
computer, equipped with the following hardware:

• 16 GB system memory

• SATA 2.0 SSD drive

• GPU: GeForce GTX 690, two GPUs with 2 GB RAM each, PCI Express 3.0

9.2 Rendering Benchmarks

Table 9.1 shows a selection of benchmarks of the different steps taken in the rendering

loop. Each measure has been determined by averaging a number of runs. The total

rendering loop time has been measured when utilizing the asynchronous execution

of the rendering, while the other time benchmarks have been measured individually.

| A | B | C | D | E | F | G | H | I | J |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 128 | 128 | 0.02 | 0.01 | 0.07 | 0.019 | 0.028 | 0.05 | 0.00005 | 0.0002 |
| 128 | 128 | 0.04 | 0.03 | 0.07 | 0.017 | 0.010 | 0.05 | 0.00005 | 0.0002 |
| 128 | 16 | 0.02 | 0.01 | 0.14 | 0.023 | 0.062 | 0.10 | 0.010 | 0.0002 |
| 128 | 16 | 0.04 | 0.03 | 0.12 | 0.017 | 0.022 | 0.10 | 0.010 | 0.0002 |
| 128 | 32 | 0.02 | 0.01 | 0.10 | 0.016 | 0.044 | 0.07 | 0.013 | 0.0002 |
| 128 | 32 | 0.04 | 0.03 | 0.08 | 0.020 | 0.018 | 0.07 | 0.013 | 0.0002 |
| 256 | 32 | 0.02 | 0.01 | 0.50 | 0.036 | 0.069 | 0.42 | 0.010 | 0.0002 |
| 256 | 32 | 0.04 | 0.03 | 0.49 | 0.016 | 0.027 | 0.42 | 0.010 | 0.0002 |
| 256 | 64 | 0.02 | 0.01 | 0.41 | 0.023 | 0.056 | 0.39 | 0.013 | 0.0002 |
| 256 | 64 | 0.04 | 0.03 | 0.41 | 0.022 | 0.024 | 0.39 | 0.013 | 0.0002 |

A: Number of voxels per axis in full volume

B: Number of voxels per axis in bricks

C: Probing step size

D: Ray-caster step size

E: Total rendering loop time in seconds

F: Probing kernel execution time in seconds

G: Ray caster kernel execution time in seconds

H: Disk to PBO upload time in seconds

I: Read brick request list and build brick list time in seconds

J: Other render loop steps (proxy geometry, textures et cetera) in seconds

Table 9.1: Rendering loop benchmarking. Measurements made while rendering 256 time
steps of an ENLIL model.

9.3 Error Metrics Benchmarks

To benchmark the efficiency of the error metrics approach, three levels of error have
been determined. These three levels are labeled no, low and high error. The levels
have been determined using a combination of subjective visual quality and looking at
approximately how many bricks (out of the total amount in the volume) are used
and/or cached while rendering. Both temporal and spatial errors have been accepted at
the low and high error levels. The low error level corresponds to relatively poor visual
results that are still usable under some conditions. The high setting produces renderings
with very large artifacts and can only be used for benchmarking. The results are shown
in table 9.2.

The no error level uses 100% of the bricks. The low and high levels use approximately
75% and 40% of the bricks, respectively.

9.4 Visual Results

This section shows samples from interactive renderings of a CME event.


| Brick size | Error level | Total render loop time (s) |
| --- | --- | --- |
| 16 | no | 0.12 |
| 16 | low | 0.16 |
| 16 | high | 0.07 |
| 32 | no | 0.10 |
| 32 | low | 0.07 |
| 32 | high | 0.05 |

Table 9.2: Error metrics benchmarking. Measurements made while rendering 256 time
steps of an ENLIL model with 128 voxels per axis.

Desktop Rendering

Figure 9.1 shows three renderings of the same sequence, each using a different transfer
function. The three screenshots from each sequence are from the beginning, middle
and end of the visible CME event.

Dome Rendering

The Hayden Planetarium at the American Museum of Natural History (AMNH) in New York,
USA, houses an immersive fulldome theater. This theater is used for high quality space
productions, both pre-rendered and interactive. AMNH is an important collaborator
in the Open Space project, and a test run of a cluster implementation of the rendering
software was successfully carried out on-site in the planetarium. See figure 9.2 for a
photo of the occasion.


Figure 9.1: Rendering screenshots, each column with a different transfer function.


Figure 9.2: SGCT was used to enable this interactive space weather rendering at the
Hayden Planetarium at the American Museum of Natural History in New York, USA.


10 Discussion and Future Work

10.1 Visual Quality

Rendering

The visual quality of the renderings is adequate given the simple rendering approach.

While the produced images are correct and informative, it would be desirable to increase

the resolution of the volumes further, as volumes of 128 or 256 voxels per axis often

will be too low-resolution for real applications.

10.2 Interactivity

Performance

In an interactive application, performance is crucial. A certain framerate
has to be reached both to give the user a good viewing experience and to make
interactions responsive. If the framerate drops too low, interactions become unintuitive
and less useful. The performance measurements have shown that the application can
run on a consumer-grade desktop computer: the total rendering loop times in table 9.1
range from 0.07 s to 0.14 s for the 128-voxel volumes and from 0.41 s to 0.50 s for the
256-voxel volumes.


10.3 Pipeline

Encapsulation

The choice to split the software into three individual parts has been one of the major
decisions in the development. While the pipeline has not yet been fully utilized by
trying different data sources, one of the requirements has been to handle a large variety
of data sources. The encapsulation provides a funnel through which new kinds of
volumetric data can be added without altering the tree structure or rendering. In the
same way, the rendering or the tree construction can be changed without worrying
about the other stages of the pipeline. In an experimental proof-of-concept application
like the one implemented, this has been very important.

10.4 TSP Tree Structure

Construction

The implementation of the TSP tree construction is relatively straight-forward. The

focus has been to produce correct and robust results rather than making the process

fast. This approach has also been taken while developing the use of several temporary

files on disk during the creation process. The technique effectively erases many concerns

related to memory availability when constructing potentially very large tree structures.

Naturally, the trade-off for this capability is speed. A great increase in efficiency could

be achieved by developing a more dynamic solution where fast system memory is being

used as much as possible, only switching to disk when needed. The algorithm could

also benefit greatly from parallelization, but that also requires not depending on the

slow disk read/write bandwidth for many operations.

Storage

Storing the raw data on disk and the tree structure in memory has proven to be an

efficient solution, used in projects of larger scale. The time spent on constructing the

structure and uploading it to the graphics card is very small compared to the transfer

of data or kernel execution, and the reads from the structure in the kernels are also a

very small and quick part of the algorithm.


The key to this approach is the use of bricks. The brick concept is fundamental
since it provides a way to make the tree structure several orders of magnitude smaller
than the complete structure. It also provides a natural domain in which to filter and
calculate error metrics, as bricks have a spatial meaning. For caching purposes, it
is important that the bricks do not themselves need to know their place in the full
structure. This requires that the tree structure is kept in order at all times but, again,
the overhead of this structure is very small compared to the benefits of being able to
put bricks in arbitrary positions on the texture atlas that gets uploaded to the GPU.

10.5 Data Formats

The data format chosen for the implementation reflects the encapsulation in the pipeline,
being somewhat redundant and requiring careful structuring. As we have seen, the
individual parts of the pipeline can only communicate using files, so it is very important
that the data formats are kept intact to avoid changes in many parts of the software.
Considering this, it would be desirable to further break out and abstract the data
format definition outside the current pipeline and make the read and write operations
more flexible. For example, in a larger-scale implementation, it needs to be easier to
add an extra variable to a header.

10.6 Error Metrics

Calculation

The visual and performance-related results have shown that the concept of error metrics
can be useful. An increased error tolerance yields a shorter rendering time as fewer
bricks need to be uploaded and traversed. Additionally, the way of calculating these
metrics does take spatial and temporal coherence into account. Areas with less
variation do get more heavily filtered compared to areas with more changes, such as
the areas where a CME front develops.

However, there is much to improve in this area. Since the background winds in a
CME simulation are inherently rotating, using filtering in the temporal domain quickly
leads to very visible artifacts. The rotation of the magnetic fields needs to be smooth
for a good visual experience. While this rotation makes temporal filtering hard in the
current software state, there are very large gains to be made with a more sophisticated
implementation. If the movements of the background winds can be predicted, it would
be possible to reuse large portions of the data by merely changing the position
accordingly. There is often no need to update this data as it rarely changes intensity, only
position.

Another area that needs to be improved before the error metrics can be truly useful

is the calculation efficiency. As with the TSP tree construction, the implementation is

currently very straightforward and unoptimized. Traversing the tree structure several

times to calculate averages over both the spatial and temporal domains leads to an

unacceptable time complexity. Ellsworth et al. (6) show alternative implementations

of error calculations. However, that approach relies on errors based on color, which has

proven to be troublesome (see chapter 5).

As discussed in chapter 5.4, the original implementation has been altered. Working

with large areas with intensities close to zero leads to numerical problems. The same

type of problem arose when exploring the color-based approach. Using color as a

reference could be beneficial for visual results, but using color as ordinal values has its

drawbacks. Colors with small intensities (large, black areas) again lead to numerical

problems and inconsistencies. Ideally, the error metrics calculation should be able to

calculate coherency uniformly in dark as well as bright areas.
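The numerical problem in dark regions can be illustrated with a small sketch. It assumes, for illustration only, that the error is a standard deviation normalized by the mean intensity; the function names and the floor parameter are hypothetical and not part of the implemented system. When the mean approaches zero, the plain ratio becomes very large or undefined, while a denominator clamped to a small floor stays bounded.

```cpp
#include <cmath>
#include <vector>

// Mean of a set of voxel intensities (0 for an empty set).
inline float Mean(const std::vector<float>& v) {
  if (v.empty()) return 0.f;
  double sum = 0.0;
  for (float x : v) sum += x;
  return static_cast<float>(sum / v.size());
}

// Population standard deviation around a given mean.
inline float StdDev(const std::vector<float>& v, float mean) {
  if (v.empty()) return 0.f;
  double sum = 0.0;
  for (float x : v) sum += (x - mean) * (x - mean);
  return static_cast<float>(std::sqrt(sum / v.size()));
}

// Hypothetical guarded relative error: the denominator never drops below
// 'floor', so regions with near-zero intensity give bounded values.
inline float GuardedRelativeError(const std::vector<float>& v, float floor) {
  float mean = Mean(v);
  float denom = std::fabs(mean) > floor ? std::fabs(mean) : floor;
  return StdDev(v, mean) / denom;
}
```

With a floor of 0.001, a completely black brick yields an error of zero instead of a division by zero, and a brick with intensities around one millionth is treated as nearly uniform.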

Control

The current error metrics implementation has two major drawbacks. Error tolerances cannot be adjusted in real time, and measurements are not based on color. This means that the visual outcome of a given tolerance can differ considerably between transfer functions, and that it is very hard to evaluate the effects of a change quickly. For future work, a further investigation of color-based and real-time error metrics could prove very useful.

Choosing the correct brick size is very important, and the brick approach means that a key trade-off has to be made. With small bricks, the filtering schemes and error calculations are more fine-grained. Errors will be calculated for smaller areas and visual artifacts may be smaller. On the other hand, the tree structure gets larger and the traversal slower. The overhead from duplicating border voxels in the padding step also gets larger, but that might not be a problem unless bricks get very small.
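How quickly the tree grows with smaller bricks can be estimated with a short sketch. It assumes a cubic volume, a power-of-two number of bricks per axis and a power-of-two number of time steps, and it uses the layout from Appendix A.5, where every BST node holds one complete octree. The function names are illustrative.

```cpp
#include <cstdint>

// Number of octree levels for a cubic volume split into bricksPerAxis
// bricks along each axis (assumed to be a power of two).
inline unsigned int OctreeLevels(unsigned int bricksPerAxis) {
  unsigned int levels = 1;
  while (bricksPerAxis > 1) {
    bricksPerAxis /= 2;
    ++levels;
  }
  return levels;
}

// Nodes in a full octree: 1 + 8 + 64 + ... over the given number of levels.
inline std::uint64_t OctreeNodes(unsigned int levels) {
  std::uint64_t nodes = 0;
  std::uint64_t perLevel = 1;
  for (unsigned int l = 0; l < levels; ++l) {
    nodes += perLevel;
    perLevel *= 8;
  }
  return nodes;
}

// Nodes in a full binary tree with numTimesteps leaves (power of two).
inline std::uint64_t BSTNodes(std::uint64_t numTimesteps) {
  return 2 * numTimesteps - 1;
}

// Total TSP nodes: one complete octree per BST node.
inline std::uint64_t TSPNodes(unsigned int volumeDim, unsigned int brickDim,
                              std::uint64_t numTimesteps) {
  return OctreeNodes(OctreeLevels(volumeDim / brickDim)) *
         BSTNodes(numTimesteps);
}
```

For a 128-voxel cube with 8 time steps, bricks of 32 voxels per side give 73 octree nodes and 1095 TSP nodes in total, while bricks of 16 voxels give 585 octree nodes and 8775 TSP nodes. Halving the brick side multiplies the tree size by roughly eight.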


10.7 Rendering

As the focus of this project has been on the pipeline and the preparation stages, the rendering technique can be improved substantially. While the data request and rendering loop have been developed to fit the pipeline, the ray-casting rendering itself is relatively unsophisticated. There are several ways to improve this. For example, a volume sampling scheme capable of adapting to the detail level of the volume would save a lot of samples and improve the visual quality.

It is very important to be able to integrate various kinds of data and objects in the

future Open Space project. Planets, spacecraft, text labels and field lines are just some

examples of items that could fit into a scene. This has to be taken into consideration

when further developing the rendering system and choosing the appropriate methods.

As these items, or any other phenomena to be rendered, can have very large variations

in scale, adaptiveness is important not just in a volume data set but for all kinds of

data in a scene.

The dual-loop implementation, with one data request pass and a subsequent rendering pass, is in principle a simplified version of the advanced approach presented in GigaVoxels (7). While GigaVoxels focuses on static (but large) scenes, there are elements that could prove useful for future work. For example, the GigaVoxels Cache Manager, which uses a Least Recently Used updating policy, could prove efficient in combination with a further developed temporal caching approach. The data streaming system in GigaVoxels also takes visibility into account. That is important for scenes originating from mesh data, but not as crucial in volumetric scenes where the whole volume is visible. On the other hand, a rendering scheme that can provide varying levels of detail is desirable. For example, spending less time on far-away voxels or voxels that won't contribute to an already saturated viewing ray would save a lot of processing power.
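The Least Recently Used policy mentioned above can be sketched as follows. This is a minimal, host-side illustration of the policy only, not the GigaVoxels Cache Manager and not part of the implemented system; the class name and interface are hypothetical, and bricks are identified by their index.

```cpp
#include <cstddef>
#include <list>
#include <unordered_map>

// Hypothetical LRU cache that tracks which brick indices are resident.
class LruBrickCache {
 public:
  explicit LruBrickCache(std::size_t capacity) : capacity_(capacity) {}

  // Marks a brick as used. Returns true on a cache hit. On a miss the
  // brick is inserted, evicting the least recently used brick if the
  // cache is full, and false is returned.
  bool Touch(int brickIndex) {
    auto found = lookup_.find(brickIndex);
    if (found != lookup_.end()) {
      // Move the brick to the front (most recently used position)
      order_.splice(order_.begin(), order_, found->second);
      return true;
    }
    if (capacity_ == 0) return false;
    if (order_.size() == capacity_) {
      lookup_.erase(order_.back());
      order_.pop_back();
    }
    order_.push_front(brickIndex);
    lookup_[brickIndex] = order_.begin();
    return false;
  }

  bool Contains(int brickIndex) const {
    return lookup_.count(brickIndex) != 0;
  }

  std::size_t Size() const { return order_.size(); }

 private:
  std::size_t capacity_;
  std::list<int> order_;  // front = most recently used
  std::unordered_map<int, std::list<int>::iterator> lookup_;
};
```

In a streaming renderer, a miss reported by Touch would correspond to a brick that has to be uploaded, and the evicted entry to the slot that is overwritten.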

When using relatively small bricks, the brick padding solution can mean a doubling of the volume size. As smaller bricks might be desirable for fine-grained error calculations, one has to be careful when choosing the size of the bricks. The balance between error metrics control, traversal speed and storage size must be adapted to each application.
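The storage cost of padding follows directly from the brick dimensions. The sketch below assumes cubic bricks and the same padding width on every side, as in Appendix A.2; the function name is illustrative.

```cpp
// Growth factor in stored voxels when a cubic brick with side brickDim
// gets paddingWidth extra voxels on each of its six faces.
inline double PaddingOverhead(unsigned int brickDim,
                              unsigned int paddingWidth) {
  double padded = static_cast<double>(brickDim + 2 * paddingWidth);
  double ratio = padded / static_cast<double>(brickDim);
  return ratio * ratio * ratio;
}
```

With a padding width of one voxel, bricks of 8 voxels per side grow by a factor of (10/8)^3, about 1.95, which is the near doubling mentioned above. For bricks of 32 voxels per side the factor is (34/32)^3, about 1.20.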


11

Conclusions

The renderings produced using the implemented system have been correct, useful and running at interactive rates. The visual quality is good, while allowing for several further improvements. The system can produce these images on a consumer-grade hardware configuration as well as in a clustered environment, showing flexibility and adaptability. As shown before, volumetric rendering is very useful for visualizing space-related data in 3D.

For a larger-scale system, such as the future Open Space project, some important

areas of improvement can be summarized:

• The error metrics system needs to be more stable, efficient and intuitive. A

color-based, real-time solution would be desirable.

• The TSP tree solution can be useful after optimizing the construction stages and

the GPU traversal scheme.

• While the basic concepts of brick uploading, caching and traversal work well, a more mature system needs a dynamic approach in which the system can seamlessly switch between in-memory scenes and disk uploads. This could reduce overhead for small scenes or for systems with large amounts of memory available.

Equally important, there are also specific approaches that have shown to be useful:

• Encapsulation at logical places in the pipeline is important for flexibility and adaptability. This allows certain areas to be improved or changed without affecting the other parts of the system. It is important to decide on these stages early in development.

• Using bricks for the data and brick pointers for the in-memory traversal enables on-demand data uploads, something that is absolutely vital in large scenes that do not fit into memory. The choice of brick size is very important, as it balances many performance aspects.

• Storing and building the tree structures on disk is important for letting the system

scale and handle very large amounts of data.


References

[1] Committee on Solar and Space Physics and Committee on Solar-Terrestrial Research. Space weather: A research perspective, 1997.

[2] The National Space Weather Program Council. The national space weather program - the strategic plan, 1995.

[3] M. Törnros. Interactive visualization of space weather data. Master's thesis, Linköping University, Sweden, 2013.

[4] Jennis Meyer-Spradow, Timo Ropinski, Jörg Mensmann, and Klaus Hinrichs. Voreen: A rapid-prototyping environment for ray-casting-based volume visualizations. In IEEE Computer Graphics and Applications, Volume 29, Number 6, pages 6–13, 2009.

[5] Han-Wei Shen, Ling-Jen Chiang, and Kwan-Liu Ma. A fast volume rendering algorithm for time-varying fields using a time-space partitioning (TSP) tree. In Proceedings of the Conference on Visualization '99: Celebrating Ten Years, VIS '99, pages 371–377, Los Alamitos, CA, USA, 1999. IEEE Computer Society Press.

[6] David Ellsworth, Ling-Jen Chiang, and Han-Wei Shen. Accelerating time-varying hardware volume rendering using TSP trees and color-based error metrics. In Proceedings of the 2000 IEEE Symposium on Volume Visualization, VVS '00, pages 119–128, New York, NY, USA, 2000. ACM.

[7] Cyril Crassin, Fabrice Neyret, Sylvain Lefebvre, and Elmar Eisemann. Gigavoxels: Ray-guided streaming for efficient and detailed voxel rendering. In Proceedings of the 2009 Symposium on Interactive 3D Graphics and Games, I3D '09, pages 15–22, New York, NY, USA, 2009. ACM.

[8] Daniel Reiter Horn, Jeremy Sugerman, Mike Houston, and Pat Hanrahan. Interactive k-d tree GPU raytracing. In Proceedings of the 2007 Symposium on Interactive 3D Graphics and Games, I3D '07, pages 167–174, New York, NY, USA, 2007. ACM.

[9] Hong Xie, Leon Ofman, and Gareth Lawrence. Cone model for halo CMEs: Application to space weather forecasting. J. Geophys. Res., 109(A03109), 2004.

[10] Community Coordinated Modeling Center. http://ccmc.gsfc.nasa.gov/. Accessed: 2014-02-09.

[11] Community Coordinated Modeling Center. Kameleon - conversion, access, interpolation. http://ccmc.gsfc.nasa.gov/downloads/kameleon.pdf, 2006. Accessed: 2014-02-09.

[12] G. M. Morton. A computer oriented geodetic data base and a new technique in file sequencing. In IBM Germany Scientific Symposium Series, 1966.

[13] J. Krüger and R. Westermann. Acceleration techniques for GPU-based volume rendering. In Proceedings of the 14th IEEE Visualization 2003 (VIS'03), VIS '03, pages 38–, Washington, DC, USA, 2003. IEEE Computer Society.

[14] NVIDIA Corporation. NVIDIA OpenCL best practices guide, 2009.

Appendix A

Code Samples

A.1 TSP Tree Traversal

// OpenCL kernel

// Mirrors struct on host side
struct TraversalConstants {
  int gridType;
  float stepsize;
  int numTimesteps;
  int numValuesPerNode;
  int numOTNodes;
  float temporalTolerance;
  float spatialTolerance;
};

// Return index to left BST child (low timespan)
int LeftBST(int bstNodeIndex, int numValuesPerNode, int numOTNodes,
            bool bstRoot, __global __read_only int *tsp) {
  // If the BST node is a root, the child pointer is used for the OT.
  // The child index is next to the root.
  // If not root, look up in TSP structure.
  if (bstRoot) {
    return bstNodeIndex + numOTNodes;
    // return bstNodeIndex + 1;
  } else {
    return tsp[bstNodeIndex*numValuesPerNode + 1];
  }
}

// Return index to right BST child (high timespan)
int RightBST(int bstNodeIndex, int numValuesPerNode, int numOTNodes,
             bool bstRoot, __global __read_only int *tsp) {
  if (bstRoot) {
    return bstNodeIndex + numOTNodes*2;
  } else {
    return tsp[bstNodeIndex*numValuesPerNode + 1] + numOTNodes;
  }
}

// Return child node index given a BST node, a time span and a timestep
// Updates timespan
int ChildNodeIndex(int bstNodeIndex, int *timespanStart, int *timespanEnd,
                   int timestep, int numValuesPerNode, int numOTNodes,
                   bool bstRoot, __global __read_only int *tsp) {
  // Choose left or right child
  int middle = *timespanStart + (*timespanEnd - *timespanStart)/2;
  if (timestep <= middle) {
    // Left subtree
    *timespanEnd = middle;
    return LeftBST(bstNodeIndex, numValuesPerNode, numOTNodes,
                   bstRoot, tsp);
  } else {
    // Right subtree
    *timespanStart = middle+1;
    return RightBST(bstNodeIndex, numValuesPerNode, numOTNodes,
                    bstRoot, tsp);
  }
}

// Return the brick index that a BST node represents
int BrickIndex(int bstNodeIndex, int numValuesPerNode,
               __global __read_only int *tsp) {
  return tsp[bstNodeIndex*numValuesPerNode + 0];
}

// Checks if a BST node is a leaf or not
bool IsBSTLeaf(int bstNodeIndex, int numValuesPerNode,
               bool bstRoot, __global __read_only int *tsp) {
  if (bstRoot) return false;
  return (tsp[bstNodeIndex*numValuesPerNode + 1] == -1);
}

// Checks if an OT node is a leaf or not
bool IsOctreeLeaf(int otNodeIndex, int numValuesPerNode,
                  __global __read_only int *tsp) {
  // CHILD_INDEX is at offset 1, and -1 represents leaf
  return (tsp[otNodeIndex*numValuesPerNode + 1] == -1);
}

// Return OT child index given current node and child number (0-7)
int OTChildIndex(int otNodeIndex, int numValuesPerNode,
                 int child,
                 __global __read_only int *tsp) {
  int firstChild = tsp[otNodeIndex*numValuesPerNode + 1];
  return firstChild + child;
}

// Return the temporal error of a node (stored at offset 3)
float TemporalError(int bstNodeIndex, int numValuesPerNode,
                    __global __read_only int *tsp) {
  return as_float(tsp[bstNodeIndex*numValuesPerNode + 3]);
}

// Return the spatial error of a node (stored at offset 2)
float SpatialError(int bstNodeIndex, int numValuesPerNode,
                   __global __read_only int *tsp) {
  return as_float(tsp[bstNodeIndex*numValuesPerNode + 2]);
}

// Given a point, a box mid value and an offset,
// return enclosing octree child
int EnclosingChild(float3 P, float boxMid, float3 offset) {
  if (P.x < boxMid+offset.x) {
    if (P.y < boxMid+offset.y) {
      if (P.z < boxMid+offset.z) {
        return 0;
      } else {
        return 4;
      }
    } else {
      if (P.z < boxMid+offset.z) {
        return 2;
      } else {
        return 6;
      }
    }
  } else {
    if (P.y < boxMid+offset.y) {
      if (P.z < boxMid+offset.z) {
        return 1;
      } else {
        return 5;
      }
    } else {
      if (P.z < boxMid+offset.z) {
        return 3;
      } else {
        return 7;
      }
    }
  }
}

// Update octree offset
void UpdateOffset(float3 *offset, float boxDim, int child) {
  if (child == 0) {
    // do nothing
  } else if (child == 1) {
    offset->x += boxDim;
  } else if (child == 2) {
    offset->y += boxDim;
  } else if (child == 3) {
    offset->x += boxDim;
    offset->y += boxDim;
  } else if (child == 4) {
    offset->z += boxDim;
  } else if (child == 5) {
    offset->x += boxDim;
    offset->z += boxDim;
  } else if (child == 6) {
    offset->y += boxDim;
    offset->z += boxDim;
  } else if (child == 7) {
    *offset += (float3)(boxDim);
  }
}

// Given an octree node index, traverse the corresponding BST tree and look
// for a useful brick.
bool TraverseBST(int otNodeIndex, int *brickIndex, int timestep,
                 __constant struct TraversalConstants *constants,
                 __global volatile int *reqList,
                 __global __read_only int *tsp) {

  // Start at the root of the current BST
  int bstNodeIndex = otNodeIndex;
  bool bstRoot = true;
  int timespanStart = 0;
  int timespanEnd = constants->numTimesteps;

  // Rely on structure for termination
  while (true) {

    // Update brick index (regardless if we use it or not)
    *brickIndex = BrickIndex(bstNodeIndex,
                             constants->numValuesPerNode,
                             tsp);

    // If temporal error is ok
    if (TemporalError(bstNodeIndex, constants->numValuesPerNode,
                      tsp) <= constants->temporalTolerance) {

      // If the OT node is a leaf, we can't do any better spatially so we
      // return the current brick
      if (IsOctreeLeaf(otNodeIndex, constants->numValuesPerNode, tsp)) {
        return true;

      // All is well!
      } else if (SpatialError(bstNodeIndex, constants->numValuesPerNode,
                              tsp) <= constants->spatialTolerance) {
        return true;

      // If spatial failed and the BST node is a leaf
      // The traversal will continue in the octree (we know that
      // the octree node is not a leaf)
      } else if (IsBSTLeaf(bstNodeIndex, constants->numValuesPerNode,
                           bstRoot, tsp)) {
        return false;

      // Keep traversing BST
      } else {
        bstNodeIndex = ChildNodeIndex(bstNodeIndex, &timespanStart,
                                      &timespanEnd, timestep,
                                      constants->numValuesPerNode,
                                      constants->numOTNodes,
                                      bstRoot, tsp);
      }

    // If temporal error is too big and the node is a leaf
    // Return false to traverse OT
    } else if (IsBSTLeaf(bstNodeIndex, constants->numValuesPerNode,
                         bstRoot, tsp)) {
      return false;

    // If temporal error is too big and we can continue
    } else {
      bstNodeIndex = ChildNodeIndex(bstNodeIndex, &timespanStart,
                                    &timespanEnd, timestep,
                                    constants->numValuesPerNode,
                                    constants->numOTNodes,
                                    bstRoot, tsp);
    }

    bstRoot = false;
  }
}

// Traverse one ray through the volume, build brick list
void TraverseOctree(float3 rayO,
                    float3 rayD,
                    float maxDist,
                    __constant struct TraversalConstants *constants,
                    __global volatile int *reqList,
                    __global __read_only int *tsp,
                    const int timestep) {

  float stepsize = constants->stepsize;
  float3 P = rayO;
  // Keep traversing until the sample point goes outside the unit cube
  float traversed = 0.0f;
  while (traversed < maxDist) {

    // Reset traversal variables
    float3 offset = (float3)(0.0f);
    float boxDim = 1.0f;
    int child;

    // Init the octree node index to the root
    int otNodeIndex = OctreeRootNodeIndex();

    // Start traversing octree
    // Rely on finding a leaf for loop termination
    while (true) {

      // See if the BST tree is good enough
      int brickIndex = 0;
      bool bstSuccess = TraverseBST(otNodeIndex, &brickIndex, timestep,
                                    constants, reqList, tsp);

      if (bstSuccess) {

        // Visit and use brick (e.g. probing or rendering)
        UseBrick(brickIndex);
        // We are now done with this node, so go to next
        break;

      } else if (IsOctreeLeaf(otNodeIndex,
                              constants->numValuesPerNode, tsp)) {
        // If the BST lookup failed but the octree node is a leaf,
        // use the brick anyway (it is the BST leaf)
        UseBrick(brickIndex);
        // We are now done with this node, so go to next
        break;

      } else {
        // If the BST lookup failed and we can traverse the octree,
        // visit the child that encloses the point

        // Next box dimension
        boxDim = boxDim/2.0f;

        // Current mid point
        float boxMid = boxDim;

        // Check which child encloses P
        if (constants->gridType == 0) { // Cartesian
          child = EnclosingChild(P, boxMid, offset);
        } else { // Spherical (==1)
          child = EnclosingChild(CartesianToSpherical(P), boxMid, offset);
        }

        // Update offset
        UpdateOffset(&offset, boxDim, child);

        // Update node index to new node
        int oldIndex = otNodeIndex;
        otNodeIndex = OTChildIndex(otNodeIndex, constants->numValuesPerNode,
                                   child, tsp);
      }

    } // while traversing

    // Update
    traversed += stepsize;
    P += stepsize * rayD;

  } // while (traversed < maxDist)
}
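The child numbering used by EnclosingChild and UpdateOffset follows a bit pattern: bit 0 is set for the upper half in x, bit 1 for y and bit 2 for z. Under that observation, both functions can be written without nested branches. The sketch below is plain C++ rather than OpenCL, Float3 is a stand-in for the built-in float3 type, and the function names are illustrative.

```cpp
// Stand-in for OpenCL's built-in float3 type
struct Float3 {
  float x, y, z;
};

// Return the enclosing octree child (0-7). A bit is set whenever the
// point is NOT in the lower half along that axis, which mirrors the
// comparisons in the nested version.
inline int EnclosingChildBits(Float3 P, float boxMid, Float3 offset) {
  int child = 0;
  if (!(P.x < boxMid + offset.x)) child |= 1;
  if (!(P.y < boxMid + offset.y)) child |= 2;
  if (!(P.z < boxMid + offset.z)) child |= 4;
  return child;
}

// Move the offset to the corner of the chosen child.
inline void UpdateOffsetBits(Float3 *offset, float boxDim, int child) {
  if (child & 1) offset->x += boxDim;
  if (child & 2) offset->y += boxDim;
  if (child & 4) offset->z += boxDim;
}
```

A point at (0.75, 0.25, 0.25) in the unit cube with a mid value of 0.5 and zero offset lands in child 1, and a point at (0.25, 0.75, 0.75) in child 6, matching the nested version above.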

A.2 Brick Padding

// Loop over all timesteps
for (unsigned int i=0; i<numTimesteps; ++i) {

  // Storage for one time step of the raw data
  std::vector<float> timestepData(xDim*yDim*zDim,
                                  static_cast<float>(0));

  // Point to the right position in the file stream and read it
  off_t timestepSize = xDim*yDim*zDim*dataSize;
  off_t timestepOffset = static_cast<off_t>(i)*timestepSize+headerOffset;
  fseeko(in, timestepOffset, SEEK_SET);
  fread(reinterpret_cast<void*>(&timestepData[0]), timestepSize, 1, in);

  // We now have a non-padded time step, and need to pad the borders

  // Allocate space for the padded data
  std::vector<float> paddedData(xPaddedDim*yPaddedDim*zPaddedDim,
                                static_cast<float>(0));

  // Loop over the padded volume that we want to fill
  // xp = "x padded"
  // xo = "x original"
  unsigned int xo, yo, zo;
  for (unsigned int zp=0; zp<zPaddedDim; ++zp) {
    for (unsigned int yp=0; yp<yPaddedDim; ++yp) {
      for (unsigned int xp=0; xp<xPaddedDim; ++xp) {

        // Clamp padded coordinates to the nearest original voxel
        if (xp == 0) {
          xo = xp;
        } else if (xp == xPaddedDim-1) {
          xo = xp-2;
        } else {
          xo = xp-1;
        }

        if (yp == 0) {
          yo = yp;
        } else if (yp == yPaddedDim-1) {
          yo = yp-2;
        } else {
          yo = yp-1;
        }

        if (zp == 0) {
          zo = zp;
        } else if (zp == zPaddedDim-1) {
          zo = zp-2;
        } else {
          zo = zp-1;
        }

        // Row-major lookup: one z slice holds xDim*yDim voxels
        paddedData[xp + yp*xPaddedDim + zp*xPaddedDim*yPaddedDim] =
          timestepData[xo + yo*xDim + zo*xDim*yDim];
      }
    }
  }

  // Create a container for the octree leaf level bricks
  std::vector<Brick<float>* > baseLevelBricks(numBricksBaseLevel, NULL);

  // Loop over the volume's subvolumes and create one brick for each
  for (unsigned int zBrick=0; zBrick<zNumBricks; ++zBrick) {
    for (unsigned int yBrick=0; yBrick<yNumBricks; ++yBrick) {
      for (unsigned int xBrick=0; xBrick<xNumBricks; ++xBrick) {

        Brick<float> *brick = Brick<float>::New(xPaddedBrickDim,
                                                yPaddedBrickDim,
                                                zPaddedBrickDim,
                                                static_cast<float>(0));

        // Loop over the subvolume's voxels
        unsigned int xMin = xBrick*xBrickDim;
        unsigned int xMax = (xBrick+1)*xBrickDim-1+paddingWidth*2;
        unsigned int yMin = yBrick*yBrickDim;
        unsigned int yMax = (yBrick+1)*yBrickDim-1+paddingWidth*2;
        unsigned int zMin = zBrick*zBrickDim;
        unsigned int zMax = (zBrick+1)*zBrickDim-1+paddingWidth*2;

        unsigned int zLoc = 0;
        for (unsigned int zSub=zMin; zSub<=zMax; ++zSub) {
          unsigned int yLoc = 0;
          for (unsigned int ySub=yMin; ySub<=yMax; ++ySub) {
            unsigned int xLoc = 0;
            for (unsigned int xSub=xMin; xSub<=xMax; ++xSub) {
              // Look up global index in full volume
              unsigned int globalIndex =
                xSub + ySub*xPaddedDim + zSub*xPaddedDim*yPaddedDim;
              // Set data at local subvolume index
              brick->SetData(xLoc, yLoc, zLoc, paddedData[globalIndex]);
              xLoc++;
            }
            yLoc++;
          }
          zLoc++;
        }

        // Save to octree leaf level
        unsigned int brickIndex =
          xBrick + yBrick*xNumBricks + zBrick*xNumBricks*yNumBricks;
        baseLevelBricks[brickIndex] = brick;
      }
    }
  }
}

A.3 Octree Construction

// Loop over all timesteps
for (unsigned int i=0; i<numTimesteps; ++i) {

  // Make a container for all the octree bricks
  std::vector<Brick<float>* > octreeBricks(numBricksPerOctree);

  // Use Z-order coordinates to rearrange the base level bricks
  // so that the eight children for each parent node lie
  // next to each other
  for (uint16_t z=0; z<static_cast<uint16_t>(zNumBricks); ++z) {
    for (uint16_t y=0; y<static_cast<uint16_t>(yNumBricks); ++y) {
      for (uint16_t x=0; x<static_cast<uint16_t>(xNumBricks); ++x) {
        unsigned int zOrderIdx =
          static_cast<unsigned int>(ZOrder(x, y, z));
        unsigned int idx = x + y*xNumBricks + z*xNumBricks*yNumBricks;
        octreeBricks[zOrderIdx] = baseLevelBricks[idx];
      }
    }
  }

  // Construct higher levels of octree

  // Position for next brick, starting at position beyond base level
  unsigned int brickPos = numBricksBaseLevel;
  // Position for first child to average
  unsigned int childPos = 0;

  while (brickPos < numBricksPerOctree) {
    // Filter the eight children and then combine them to build
    // the higher level node
    std::vector<Brick<float>* > filteredChildren(8, NULL);
    unsigned int slot = 0;
    for (unsigned int child=childPos; child<childPos+8; ++child) {
      Brick<float> *filteredChild =
        Brick<float>::Filter(octreeBricks[child]);
      filteredChildren[slot++] = filteredChild;
    }
    Brick<float> *newBrick = Brick<float>::Combine(filteredChildren);

    // Free up some memory
    for (auto it=filteredChildren.begin();
         it!=filteredChildren.end(); ++it) {
      delete *it;
      *it = NULL;
    }

    // Set next child pos
    childPos += 8;

    // Save new brick
    octreeBricks[brickPos++] = newBrick;
  }

  // Write octree to file
  for (auto it=octreeBricks.begin(); it!=octreeBricks.end(); ++it) {
    fwrite(reinterpret_cast<void*>(&(*it)->data[0]),
           static_cast<size_t>((*it)->Size()), 1, out);
    // Free memory when we're done
    delete *it;
  }
}
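The ZOrder function used above is not listed in this appendix. A minimal sketch of a Z-order (Morton) index for three 16-bit coordinates is given below. It assumes that x occupies the least significant bit of each bit triple, which is consistent with the child numbering in Appendix A.1; the names SpreadBits3 and ZOrderSketch are illustrative.

```cpp
#include <cstdint>

// Spread the 16 bits of v so that bit k ends up at position 3*k,
// leaving two zero bits between consecutive input bits.
inline std::uint64_t SpreadBits3(std::uint16_t v) {
  std::uint64_t result = 0;
  for (unsigned int bit = 0; bit < 16; ++bit) {
    result |= static_cast<std::uint64_t>((v >> bit) & 1u) << (3 * bit);
  }
  return result;
}

// Interleave the bits of x, y and z into one Z-order index.
inline std::uint64_t ZOrderSketch(std::uint16_t x, std::uint16_t y,
                                  std::uint16_t z) {
  return SpreadBits3(x) | (SpreadBits3(y) << 1) | (SpreadBits3(z) << 2);
}
```

With this ordering, the eight bricks with coordinates in {0, 1} along each axis receive the indices 0 to 7, and the next group of eight starts at (2, 0, 0) with index 8. Siblings therefore end up contiguous in the array, which is what the averaging loop above relies on.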

A.4 BST Assembling

// If the number of timesteps is not a power of two, copy the last timestep
// enough times to make the number a power of two
CheckPowerOfTwo();

// Create base level temp file by reversing the level order

{ // Scoping files

  std::FILE *in = fopen(tempFilename.c_str(), "r");
  if (!in) return false;
  std::FILE *out = fopen(newFilename.c_str(), "w");
  if (!out) return false;

  // Read one octree level at a time, starting from the back of source
  // Write to out file in reverse order

  // Position at end of file
  for (unsigned int ts=0; ts<numTimesteps; ++ts) {

    off_t octreePos = static_cast<off_t>((numOTNodes)*numBrickVals*(ts+1));
    for (unsigned int level=0; level<numLevels; ++level) {

      unsigned int bricksPerLevel =
        static_cast<unsigned int>(pow(8, level));
      unsigned int valuesPerLevel = numBrickVals*bricksPerLevel;
      octreePos -= valuesPerLevel;
      std::vector<float> buffer(valuesPerLevel);

      fseeko(in, octreePos*(off_t)sizeof(float), SEEK_SET);
      size_t readSize = static_cast<size_t>(valuesPerLevel)*sizeof(float);
      fread(reinterpret_cast<void*>(&buffer[0]), readSize, 1, in);
      fwrite(reinterpret_cast<void*>(&buffer[0]), readSize, 1, out);
    }
  }

  fclose(in);
  fclose(out);

} // Scoping files

// Create one file for every level of the BST tree structure
// by averaging the values in the one below.
unsigned int numTimestepsInLevel = numTimesteps;
unsigned int numValsInOT = numBrickVals*numOTNodes;
std::vector<float> inBuffer1(numValsInOT);
std::vector<float> inBuffer2(numValsInOT);
std::vector<float> outBuffer(numValsInOT);

size_t OTBytes = static_cast<size_t>(numValsInOT*sizeof(float));
std::string fromFilename = newFilename;
std::string toFilename;

do {

  std::stringstream ss;
  ss << BSTLevel - 1;
  std::cout << "Creating level " << BSTLevel << std::endl;
  toFilename = tempFilename + "." + ss.str() + ".tmp";

  // Init files
  std::FILE *in = fopen(fromFilename.c_str(), "r");
  if (!in) return false;
  std::FILE *out = fopen(toFilename.c_str(), "w");
  if (!out) return false;

  fseeko(in, 0, SEEK_END);
  off_t fileSize = ftello(in);
  fseeko(in, 0, SEEK_SET);

  for (unsigned int ts=0; ts<numTimestepsInLevel; ts+=2) {

    // Read two octrees (two time steps)
    fread(reinterpret_cast<void*>(&inBuffer1[0]), OTBytes, 1, in);
    fread(reinterpret_cast<void*>(&inBuffer2[0]), OTBytes, 1, in);

    // Average time steps
    for (unsigned int i=0; i<outBuffer.size(); ++i) {
      outBuffer[i] = (inBuffer1[i] + inBuffer2[i]) / static_cast<float>(2);
    }

    // Write brick
    fwrite(reinterpret_cast<void*>(&outBuffer[0]), OTBytes, 1, out);
  }

  fromFilename = toFilename;

  fclose(in);
  fclose(out);

  BSTLevel--;
  numTimestepsInLevel /= 2;

} while (BSTLevel != 0);

std::FILE *out = fopen(outFilename.c_str(), "w");

// Write metadata to file
WriteHeader(out);

// Write each level to output
for (unsigned int level=0; level<numBSTLevels; ++level) {

  std::stringstream ss;
  ss << level;
  std::string fromFilename = tempFilename + "." + ss.str() + ".tmp";

  std::FILE *in = fopen(fromFilename.c_str(), "r");
  if (!in) return false;

  fseeko(in, 0, SEEK_END);
  off_t inFileSize = ftello(in);
  fseeko(in, 0, SEEK_SET);

  std::vector<float> buffer((size_t)inFileSize/sizeof(float));
  // Read whole file, write to out file
  fread(reinterpret_cast<void*>(&buffer[0]),
        static_cast<size_t>(inFileSize), 1, in);

  fwrite(reinterpret_cast<void*>(&buffer[0]),
         static_cast<size_t>(inFileSize), 1, out);

  fclose(in);
}
fclose(out);

// Do some checking and data validation
CheckFileSize();
ValidateData();
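CheckPowerOfTwo is not listed in this appendix. The padding step it is described as performing, repeating the last time step until the count is a power of two, could look like the sketch below. The helper names are illustrative, the time steps are held in memory for simplicity (the real pipeline works on files), and the count is assumed to be small enough that the next power of two fits in an unsigned int.

```cpp
#include <cstddef>
#include <vector>

// True if n is a power of two (1, 2, 4, 8, ...)
inline bool IsPowerOfTwo(unsigned int n) {
  return n != 0 && (n & (n - 1)) == 0;
}

// Smallest power of two that is greater than or equal to n
inline unsigned int NextPowerOfTwo(unsigned int n) {
  unsigned int p = 1;
  while (p < n) p *= 2;
  return p;
}

// Repeat the last element until the number of elements is a power of two
template <typename T>
void PadToPowerOfTwo(std::vector<T> &timesteps) {
  if (timesteps.empty()) return;
  std::size_t target =
      NextPowerOfTwo(static_cast<unsigned int>(timesteps.size()));
  const T last = timesteps.back();  // copy before the vector reallocates
  timesteps.resize(target, last);
}
```

A sequence of five time steps is padded to eight, with the last three entries being copies of the fifth. This guarantees that the pairwise averaging in the do-while loop above always finds two octrees to read at every level.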

A.5 TSP Tree Structure Construction

void TSP::Construct() {
  // Structure is saved in int array

  // Loop over the OTs (one per BST node)
  for (unsigned int OT=0; OT<numBSTNodes; ++OT) {

    // Start at the root of each OT
    unsigned int OTNode = OT*numOTNodes;

    // Calculate BST level (first level is level 0)
    unsigned int BSTLevel = (unsigned int)(log(OT+1)/log(2));

    // Traverse OT
    unsigned int OTChild = 1;
    unsigned int OTLevel = 0;
    while (OTLevel < numOTLevels) {

      unsigned int OTNodesInLevel =
        static_cast<unsigned int>(pow(8, OTLevel));

      for (unsigned int i=0; i<OTNodesInLevel; ++i) {

        // Brick index
        data[OTNode*NUM_DATA + BRICK_INDEX] = (int)OTNode;

        // Error metrics
        int localOTNode = (OTNode - OT*numOTNodes);
        data[OTNode*NUM_DATA + TEMPORAL_ERR] =
          (int)(numBSTLevels-1-BSTLevel);
        data[OTNode*NUM_DATA + SPATIAL_ERR] =
          (int)(numOTLevels-1-OTLevel);

        if (BSTLevel == 0) {

          // Calculate OT child index (-1 if node is leaf)
          int OTChildIndex =
            (OTChild < numOTNodes) ? (int)(OT*numOTNodes+OTChild) : -1;
          data[OTNode*NUM_DATA + CHILD_INDEX] = OTChildIndex;

        } else {

          // Calculate BST child index (-1 if node is BST leaf)

          // First BST node of current level
          int firstNode =
            static_cast<unsigned int>((2*pow(2, BSTLevel-1)-1)*numOTNodes);
          // First BST node of next level
          int firstChild =
            static_cast<unsigned int>((2*pow(2, BSTLevel)-1)*numOTNodes);
          // Difference between first nodes between levels
          int levelGap = firstChild-firstNode;
          // How many nodes away from the first node are we?
          int offset = (OTNode-firstNode) / numOTNodes;

          // Use level gap and offset to calculate child index
          int BSTChildIndex =
            (BSTLevel < numBSTLevels-1) ?
              (int)(OTNode+levelGap+(offset*numOTNodes)) : -1;

          data[OTNode*NUM_DATA + CHILD_INDEX] = BSTChildIndex;

        }

        OTNode++;
        OTChild += 8;
      }

      OTLevel++;
    }
  }
}
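The BST level above is computed as a ratio of floating-point logarithms and then truncated. For exact powers of two, such a ratio may in principle round to just below the integer result, depending on the math library, which would give a level that is one too low. An integer-only alternative is sketched below as one possible safeguard; the function names are illustrative and not part of the implemented system.

```cpp
// Level of node 'index' in a complete binary tree stored in breadth-first
// order, where the root has index 0 and lies on level 0.
inline unsigned int BSTLevelOf(unsigned int index) {
  unsigned int level = 0;
  unsigned int n = index + 1;
  while (n > 1) {
    n /= 2;
    ++level;
  }
  return level;
}

// Index of the first node on a given level: 2^level - 1.
// Assumes level is smaller than the number of bits in unsigned int.
inline unsigned int FirstNodeOfLevel(unsigned int level) {
  return (1u << level) - 1u;
}
```

Node 0 is on level 0, nodes 1 and 2 on level 1, nodes 3 to 6 on level 2, and node 7 starts level 3. Multiplying FirstNodeOfLevel by numOTNodes gives the same value as the firstNode expression in the listing above.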



bool TSP::CalculateSpatialError() {

  unsigned int numBrickVals =
    paddedBrickDim*paddedBrickDim*paddedBrickDim;

  std::string inFilename = config->TSPFilename();
  std::FILE *in = fopen(inFilename.c_str(), "r");
  if (!in) {
    ERROR("Failed to open " << inFilename);
    return false;
  }

  std::vector<float> buffer(numBrickVals);
  std::vector<float> averages(numTotalNodes);
  std::vector<float> stdDevs(numTotalNodes);

  // First pass: Calculate average color for each brick
  INFO("\nCalculating spatial error, first pass");
  for (unsigned int brick=0; brick<numTotalNodes; ++brick) {

    // Offset in file
    off_t offset =
      dataPos + static_cast<off_t>(brick*numBrickVals*sizeof(float));
    fseeko(in, offset, SEEK_SET);

    fread(reinterpret_cast<void*>(&buffer[0]),
      static_cast<size_t>(numBrickVals)*sizeof(float), 1, in);

    float average = 0.f;
    for (auto it=buffer.begin(); it!=buffer.end(); ++it) {
      average += *it;
    }

    averages[brick] = average / static_cast<float>(numBrickVals);
  }

  // Spatial SNR stats
  float minError = 1e20f;
  float maxError = 0.f;
  std::vector<float> medianArray(numTotalNodes);

  // Second pass: For each brick, compare the covered leaf voxels with
  // the brick average
  INFO("Calculating spatial error, second pass");
  for (unsigned int brick=0; brick<numTotalNodes; ++brick) {

    // Fetch mean intensity
    float brickAvg = averages[brick];

    // Sum for std dev computation
    float stdDev = 0.f;

    // Get a list of leaf bricks that the current brick covers
    std::list<unsigned int> coveredLeafBricks =
      CoveredLeafBricks(brick);

    // If the brick is already a leaf, assign a negative error.
    // Ad hoc "hack" to distinguish leaves from other nodes that happen
    // to get a zero error due to rounding errors or other reasons.
    if (coveredLeafBricks.size() == 1) {
      stdDev = -0.1f;
    } else {

      // Calculate "standard deviation" corresponding to leaves
      for (auto lb=coveredLeafBricks.begin();
           lb!=coveredLeafBricks.end(); ++lb) {

        // Read brick
        off_t offset =
          dataPos+static_cast<off_t>((*lb)*numBrickVals*sizeof(float));
        fseeko(in, offset, SEEK_SET);
        fread(reinterpret_cast<void*>(&buffer[0]),
          static_cast<size_t>(numBrickVals)*sizeof(float), 1, in);

        // Add to sum
        for (auto v=buffer.begin(); v!=buffer.end(); ++v) {
          stdDev += pow(*v-brickAvg, 2.f);
        }

      }

      // Finish calculation
      if (sizeof(float) != sizeof(int)) {
        ERROR("Float and int sizes don't match, can't reinterpret");
        return false;
      }

      stdDev /= static_cast<float>(coveredLeafBricks.size()*numBrickVals);
      stdDev = sqrt(stdDev);

    }

    if (stdDev < minError) {
      minError = stdDev;
    } else if (stdDev > maxError) {
      maxError = stdDev;
    }

    stdDevs[brick] = stdDev;
    medianArray[brick] = stdDev;

  }

  fclose(in);

  std::sort(medianArray.begin(), medianArray.end());
  float medError = medianArray[medianArray.size()/2];

  // "Normalize" errors
  float minNorm = 1e20f;
  float maxNorm = 0.f;
  for (unsigned int i=0; i<numTotalNodes; ++i) {
    // float normalized = (stdDevs[i]-minError)/(maxError-minError);
    if (stdDevs[i] > 0.f) {
      stdDevs[i] = pow(stdDevs[i], 0.5f);
    }
    data[i*NUM_DATA+SPATIAL_ERR] = *reinterpret_cast<int*>(&stdDevs[i]);
    if (stdDevs[i] < minNorm) {
      minNorm = stdDevs[i];
    } else if (stdDevs[i] > maxNorm) {
      maxNorm = stdDevs[i];
    }
  }

  std::sort(stdDevs.begin(), stdDevs.end());
  float medNorm = stdDevs[stdDevs.size()/2];

  minSpatialError = minNorm;
  maxSpatialError = maxNorm;
  medianSpatialError = medNorm;

  return true;
}
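To make the two-pass computation easier to follow, here is a minimal in-memory sketch of the spatial error for a single brick. It is not the thesis implementation: file I/O is replaced by vectors, and the name `SpatialError` and its parameters are assumptions made for illustration. Like the listing, it computes the root-mean-square deviation of all covered leaf voxels from the brick's own average and returns -0.1 for a brick that covers only one leaf. The subsequent exponent step (`pow(..., 0.5f)`) is left out.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Root-mean-square deviation of the covered leaf voxels from the brick's
// own average, mirroring the two passes of the listing for one brick.
// Assumes brickVoxels and every covered leaf are non-empty.
float SpatialError(const std::vector<float>& brickVoxels,
                   const std::vector<std::vector<float>>& coveredLeaves) {
  // A brick covering a single leaf is itself a leaf: marker value
  if (coveredLeaves.size() == 1) {
    return -0.1f;
  }

  // First pass: average value of the brick
  float brickAvg = 0.f;
  for (float v : brickVoxels) {
    brickAvg += v;
  }
  brickAvg /= static_cast<float>(brickVoxels.size());

  // Second pass: squared deviation of every covered leaf voxel
  float sum = 0.f;
  std::size_t count = 0;
  for (const auto& leaf : coveredLeaves) {
    for (float v : leaf) {
      const float d = v - brickAvg;
      sum += d * d;
    }
    count += leaf.size();
  }
  return std::sqrt(sum / static_cast<float>(count));
}
```

For a brick with voxels {1, 3} (average 2) covering the leaves {0, 2} and {2, 4}, the squared deviations sum to 8 over 4 voxels, so the result is the square root of 2.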

bool TSP::CalculateTemporalError() {

  std::string inFilename = config->TSPFilename();
  std::FILE *in = fopen(inFilename.c_str(), "r");
  if (!in) {
    ERROR("Failed to open " << inFilename);
    return false;
  }

  std::vector<float> meanArray(numTotalNodes);

  // Save errors
  std::vector<float> errors(numTotalNodes);

  // Calculate temporal error for one brick at a time
  for (unsigned int brick=0; brick<numTotalNodes; ++brick) {

    unsigned int numBrickVals =
      paddedBrickDim*paddedBrickDim*paddedBrickDim;

    // Save the individual voxel's average over timesteps. Because the
    // BSTs are built by averaging leaf nodes, we only need to sample
    // the brick at the correct coordinate.
    std::vector<float> voxelAverages(numBrickVals);
    std::vector<float> voxelStdDevs(numBrickVals);

    // Read the whole brick to fill the averages
    off_t offset =
      dataPos+static_cast<off_t>(brick*numBrickVals*sizeof(float));
    fseeko(in, offset, SEEK_SET);
    fread(reinterpret_cast<void*>(&voxelAverages[0]),
      static_cast<size_t>(numBrickVals)*sizeof(float), 1, in);

    // Build a list of the BST leaf bricks (within the same octree level)
    // that this brick covers
    std::list<unsigned int> coveredBricks = CoveredBSTLeafBricks(brick);

    // If the brick is at the lowest BST level, automatically set the error
    // to -0.1 (enables using -1 as a marker for "no error accepted");
    // Somewhat ad hoc to get around the fact that the error could be
    // 0.0 higher up in the tree
    if (coveredBricks.size() == 1) {
      errors[brick] = -0.1f;
    } else {

      // Calculate standard deviation per voxel, average over brick
      float avgStdDev = 0.f;
      for (unsigned int voxel=0; voxel<numBrickVals; ++voxel) {

        float stdDev = 0.f;
        for (auto leaf = coveredBricks.begin();
             leaf != coveredBricks.end(); ++leaf) {

          // Sample the leaves at the corresponding voxel position
          off_t sampleOffset = dataPos +
            static_cast<off_t>((*leaf*numBrickVals+voxel)*sizeof(float));
          fseeko(in, sampleOffset, SEEK_SET);
          float sample;
          fread(reinterpret_cast<void*>(&sample), sizeof(float), 1, in);

          stdDev += pow(sample-voxelAverages[voxel], 2.f);
        }
        stdDev /= static_cast<float>(coveredBricks.size());
        stdDev = sqrt(stdDev);

        avgStdDev += stdDev;
      } // for voxel

      avgStdDev /= static_cast<float>(numBrickVals);
      meanArray[brick] = avgStdDev;
      errors[brick] = avgStdDev;

    }
  } // for all bricks

  fclose(in);

  std::sort(meanArray.begin(), meanArray.end());
  float medErr = meanArray[meanArray.size()/2];

  // Adjust errors using user-provided exponents
  float minNorm = 1e20f;
  float maxNorm = 0.f;
  for (unsigned int i=0; i<numTotalNodes; ++i) {
    if (errors[i] > 0.f) {
      errors[i] = pow(errors[i], 0.25f);
    }
    data[i*NUM_DATA+TEMPORAL_ERR] = *reinterpret_cast<int*>(&errors[i]);
    if (errors[i] < minNorm) {
      minNorm = errors[i];
    } else if (errors[i] > maxNorm) {
      maxNorm = errors[i];
    }
  }

  std::sort(errors.begin(), errors.end());
  float medNorm = errors[errors.size()/2];

  minTemporalError = minNorm;
  maxTemporalError = maxNorm;
  medianTemporalError = medNorm;

  return true;
}
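Both error functions end with the same pass: positive errors are raised to an exponent (0.5 for the spatial and 0.25 for the temporal error in the listings), the float bit pattern is stored in the integer node array, and minimum, maximum and median are recorded. The sketch below shows one possible way to write that pass with standard library facilities. It is an illustration under stated assumptions, not the thesis code: `ErrorStats`, `FloatBits`, `FloatFromBits` and `FinalizeErrors` are hypothetical names, and `std::memcpy` is swapped in for the `reinterpret_cast` of the listing to sidestep pointer aliasing concerns. Note also that an `if`/`else if` chain for the minimum and maximum can miss the maximum when the largest value comes first, which `std::minmax_element` avoids.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct ErrorStats {
  float min;
  float max;
  float median;
};

// Copy the bit pattern of a float into a 32-bit integer slot.
std::int32_t FloatBits(float value) {
  static_assert(sizeof(float) == sizeof(std::int32_t),
                "float and int32 sizes must match");
  std::int32_t bits;
  std::memcpy(&bits, &value, sizeof(bits));
  return bits;
}

// Inverse of FloatBits.
float FloatFromBits(std::int32_t bits) {
  float value;
  std::memcpy(&value, &bits, sizeof(value));
  return value;
}

// Raise positive errors to `exponent`, write their bit patterns to `packed`
// and return min, max and median of the adjusted errors. Negative marker
// values (such as -0.1 for leaves) are stored unchanged.
ErrorStats FinalizeErrors(std::vector<float> errors, float exponent,
                          std::vector<std::int32_t>& packed) {
  packed.resize(errors.size());
  for (std::size_t i = 0; i < errors.size(); ++i) {
    if (errors[i] > 0.f) {
      errors[i] = std::pow(errors[i], exponent);
    }
    packed[i] = FloatBits(errors[i]);
  }

  ErrorStats stats = {0.f, 0.f, 0.f};
  if (errors.empty()) {
    return stats;
  }

  // Minimum and maximum are found independently of each other
  const auto minmax = std::minmax_element(errors.begin(), errors.end());
  stats.min = *minmax.first;
  stats.max = *minmax.second;

  // Element at position size/2 of the sorted order, as in the listings
  const auto mid = errors.begin() + errors.size() / 2;
  std::nth_element(errors.begin(), mid, errors.end());
  stats.median = *mid;
  return stats;
}
```

Unlike the listings, which start the maximum at 0, this version reports the true maximum even if every error is negative.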

A.7 Rendering Loop

bool Raycaster::Render() {

  // Update transformation matrices and bind them to color cube shader
  if (!UpdateMatrices()) return false;
  if (!BindTransformationMatrices(cubeShaderProgram)) return false;

  glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

  // Render cube
  glUseProgram(cubeShaderProgram->Handle());
  cubePositionAttrib = cubeShaderProgram->GetAttribLocation("position");
  glFrontFace(GL_CW);
  glEnable(GL_CULL_FACE);

  // Front cube
  glBindFramebuffer(GL_FRAMEBUFFER, cubeFrontFBO);
  glCullFace(GL_BACK);
  glBindVertexArray(cubeVAO);
  glBindBuffer(GL_ARRAY_BUFFER, cubePosbufferObject);
  glEnableVertexAttribArray(cubePositionAttrib);
  glVertexAttribPointer(0, 4, GL_FLOAT, GL_FALSE, 0, 0);
  // ...
  glDrawArrays(GL_TRIANGLES, 0, 144);
  glDisableVertexAttribArray(cubePositionAttrib);
  glBindBuffer(GL_ARRAY_BUFFER, 0);
  glBindFramebuffer(GL_FRAMEBUFFER, 0);
  glBindVertexArray(0);

  // Back cube
  glBindFramebuffer(GL_FRAMEBUFFER, cubeBackFBO);
  glCullFace(GL_FRONT);
  glBindVertexArray(cubeVAO);
  glBindBuffer(GL_ARRAY_BUFFER, cubePosbufferObject);
  glEnableVertexAttribArray(cubePositionAttrib);
  glVertexAttribPointer(0, 4, GL_FLOAT, GL_FALSE, 0, 0);
  // ...
  glDrawArrays(GL_TRIANGLES, 0, 144);
  glDisableVertexAttribArray(cubePositionAttrib);
  glBindBuffer(GL_ARRAY_BUFFER, 0);
  glBindFramebuffer(GL_FRAMEBUFFER, 0);
  glBindVertexArray(0);

  glUseProgram(0);

  // Get current and next time step from separate Animator class
  unsigned int currentTimestep;
  unsigned int nextTimestep;
  currentTimestep = animator->CurrentTimestep();
  nextTimestep = animator->NextTimestep();

  // Choose buffers
  BrickManager::BUFFER_INDEX currentBuf, nextBuf;
  if (currentTimestep % 2 == 0) {
    currentBuf = BrickManager::EVEN;
    nextBuf = BrickManager::ODD;
  } else {
    currentBuf = BrickManager::ODD;
    nextBuf = BrickManager::EVEN;
  }

  // When starting a rendering iteration, the PBO corresponding to the
  // current timestep is loaded with the data.

  // Launch traversal of the next timestep
  if (!LaunchTSPProbing(nextTimestep)) return false;

  // While traversal of next step is working, upload current data to atlas
  if (!brickManager->PBOToAtlas(currentBuf)) return false;

  // Make sure the traversal kernel is done
  if (!clManager->FinishProgram("TSPProbing")) return false;

  // Read buffer and release the memory
  if (!clManager->ReadBuffer("TSPProbing", tspBrickListArg,
                             reinterpret_cast<void*>(&brickRequest[0]),
                             brickRequest.size()*sizeof(int),
                             true)) return false;

  if (!clManager->ReleaseBuffer("TSPProbing", tspBrickListArg))
    return false;

  // When traversal of next timestep is done, launch raycasting kernel
  if (!clManager->SetInt("TSPRaycaster", timestepArg, currentTimestep))
    return false;

  // Add brick list
  if (!clManager->
      AddBuffer("TSPRaycaster", brickListArg,
        reinterpret_cast<void*>(&(brickManager->BrickList(currentBuf)[0])),
        brickManager->BrickList(currentBuf).size()*sizeof(int),
        CLManager::COPY_HOST_PTR,
        CLManager::READ_ONLY)) return false;

  if (!clManager->PrepareProgram("TSPRaycaster")) return false;

  if (!clManager->LaunchProgram("TSPRaycaster",
                                winWidth,
                                winHeight,
                                config->LocalWorkSizeX(),
                                config->LocalWorkSizeY()))
    return false;

  // While the raycaster kernel is working, build next brick list and start
  // upload to the next PBO
  if (!brickManager->BuildBrickList(nextBuf, brickRequest)) return false;
  if (!brickManager->DiskToPBO(nextBuf)) return false;

  // Finish raycaster and render current frame
  if (!clManager->ReleaseBuffer("TSPRaycaster", brickListArg))
    return false;
  if (!clManager->FinishProgram("TSPRaycaster")) return false;

  // Render to framebuffer using quad
  glBindFramebuffer(GL_FRAMEBUFFER,
                    SGCTWinManager::Instance()->FBOHandle());

  if (!quadTex->Bind(quadShaderProgram, "quadTex", 0)) return false;

  glDisable(GL_CULL_FACE);

  glUseProgram(quadShaderProgram->Handle());
  quadPositionAttrib = quadShaderProgram->GetAttribLocation("position");
  if (quadPositionAttrib == -1) {
    ERROR("Quad position attribute lookup failed");
    return false;
  }
  glCullFace(GL_BACK);
  glBindVertexArray(quadVAO);
  glBindBuffer(GL_ARRAY_BUFFER, quadPosbufferObject);
  glEnableVertexAttribArray(quadPositionAttrib);
  // ...
  glDisableVertexAttribArray(quadPositionAttrib);
  // ...

  if (CheckGLError("Quad rendering") != GL_NO_ERROR) {
    return false;
  }

  glUseProgram(0);

  // Window manager takes care of swapping buffers

  return true;
}
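The rendering loop overlaps work by alternating between two buffers: the buffer for the current timestep is uploaded and raycast while the buffer for the next timestep is filled from disk. The fragment below is a small sketch of that even/odd selection, written to make one property explicit: the buffer filled as "next" in one frame is the one read as "current" in the following frame only if the two timesteps differ in parity. The names `BufferIndex`, `BufferPair`, `ChooseBuffers` and `HandoffIsConsistent` are hypothetical and do not appear in the original code.

```cpp
enum class BufferIndex { EVEN, ODD };

struct BufferPair {
  BufferIndex current;  // read by the raycaster this frame
  BufferIndex next;     // filled from disk for the following frame
};

// Select buffers from the parity of the current timestep, as in the listing.
BufferPair ChooseBuffers(unsigned int currentTimestep) {
  if (currentTimestep % 2 == 0) {
    return {BufferIndex::EVEN, BufferIndex::ODD};
  }
  return {BufferIndex::ODD, BufferIndex::EVEN};
}

// True if the buffer filled as "next" during `timestep` is the buffer
// read as "current" during `followingTimestep`.
bool HandoffIsConsistent(unsigned int timestep,
                         unsigned int followingTimestep) {
  return ChooseBuffers(timestep).next ==
         ChooseBuffers(followingTimestep).current;
}
```

Stepping from timestep 4 to 5 hands the ODD buffer over as expected. A jump from 4 to 0, as could occur when a sequence with an odd number of timesteps wraps around, would not. How the Animator class handles that case is not shown in the listing, so this is a point to check rather than a known defect.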
