COMPUTER PROCESSING AND MODELING - Full Papers

Magnetic Resonance in Medicine 69:1768–1776 (2013)

Gadgetron: An Open Source Framework for Medical Image Reconstruction

Michael Schacht Hansen1* and Thomas Sangild Sørensen2,3

This work presents a new open source framework for medical image reconstruction called the “Gadgetron.” The framework implements a flexible system for creating streaming data processing pipelines where data pass through a series of modules or “Gadgets” from raw data to reconstructed images. The data processing pipeline is configured dynamically at run-time based on an extensible markup language configuration description. The framework promotes reuse and sharing of reconstruction modules, and new Gadgets can be added to the Gadgetron framework through a plugin-like architecture without recompiling the basic framework infrastructure. Gadgets are typically implemented in C/C++, but the framework includes wrapper Gadgets that allow the user to implement new modules in the Python scripting language for rapid prototyping. In addition to the streaming framework infrastructure, the Gadgetron comes with a set of dedicated toolboxes in shared libraries for medical image reconstruction. This includes generic toolboxes for data-parallel (e.g., GPU-based) execution of compute-intensive components. The basic framework architecture is independent of medical imaging modality, but this article focuses on its application to Cartesian and non-Cartesian parallel magnetic resonance imaging. Magn Reson Med 69:1768–1776, 2013. © 2012 Wiley Periodicals, Inc.

Key words: image reconstruction; open source; GPU

INTRODUCTION

Image reconstruction software is an integral part of all modern medical imaging devices, and medical image reconstruction research is a strong and active field with hundreds of articles published each year. In the field of magnetic resonance imaging (MRI), great advances have been driven by image reconstruction algorithms. Examples include parallel MRI reconstruction (1–3) and, more recently, compressive sensing (4,5).

Most image reconstruction algorithms are published without a reference implementation (i.e., without source code). In some cases, the authors—or the device vendors they collaborate with—are reluctant to share their algorithms. In many other cases, there is simply no practical

1Division of Intramural Research, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland, USA
2Department of Computer Science, Aarhus University, Aarhus, Denmark
3Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
*Correspondence to: Michael S. Hansen, Ph.D., National Heart, Lung, and Blood Institute, NIH, NIH Building 10/B1D-405, 10 Center Drive, Bethesda, MD 20892. E-mail: [email protected]
Disclosure: The National Heart, Lung, and Blood Institute and Siemens Medical Systems have a Cooperative Research and Development Agreement (CRADA).
Received 16 March 2012; revised 25 April 2012; accepted 2 June 2012.
DOI 10.1002/mrm.24389
Published online 12 July 2012 in Wiley Online Library (wileyonlinelibrary.com).

way of sharing the algorithms; they may rely on a great deal of accessory code, some of which could be vendor specific or even contain vendor-provided code that cannot be shared. Regardless of the reasons, it undermines the scientific process that readers and reviewers are prevented from reproducing the results of reconstruction research articles. It is exceedingly difficult for other researchers to evaluate how a given algorithm might perform given a different type of data or how it might interact with other algorithms. As a result, researchers who wish to build on previous work by other groups often have to reimplement previous work for comparison purposes, as there is no available common platform for sharing medical image reconstruction algorithms. Given the complexities of new reconstruction algorithms, it may be difficult or indeed impossible to reimplement the methods described in research articles. This problem has been pointed out by other researchers, e.g., in the report by Buckheit and Donoho (6).

The proof-of-concept implementations that form the basis of many publications have further limitations. They are often unsuitable for deployment in a clinical (or clinical research) environment where they could be tested in broader patient populations. More specifically, reconstruction algorithm development is often done in “offline” applications that are difficult to integrate with medical devices in such a way that the reconstruction can be performed “online.” Consequently, proper evaluation of a given reconstruction algorithm is often not done until a medical device vendor implements it in a commercial product. Furthermore, when developers wish to deploy their algorithms clinically, they could be limited by the device manufacturer's reconstruction development environment. As manufacturers are under regulatory restrictions and consequently have long lead times on integration of the latest hardware and software in their platforms, delays in early testing of new algorithms can result. Moreover, a new algorithm may rely on more computational power than is available from the manufacturer's hardware; the algorithm may rely on dedicated high-performance computing equipment such as graphics processing units (GPUs) (7–9), or the algorithm may simply need software libraries that are not available or easy to integrate into a commercial medical device.

Thus, it would be desirable to have an open source platform for implementing medical image reconstruction algorithms in general and, more specifically, MRI reconstruction algorithms. To address the above-mentioned concerns, such a platform should have the following characteristics:

• Free. The platform should be freely available to all researchers. Specifically, it should be provided with full source code at no cost. Users should be allowed to modify, redistribute, repackage, and deploy the platform with no restrictions.

• Modular. Except in the most trivial cases, medical image reconstruction algorithms consist of multiple steps that must be performed accurately and in a timely fashion to ensure adequate performance of the medical imaging device. Many of these steps (e.g., basic data filtering) are shared among many different reconstruction algorithms, and the platform should facilitate reuse of such components. This would enable developers to focus on the new algorithm.

• Flexible. Although medical image reconstruction algorithms share many common components, there are significant differences between algorithms and data structures for different imaging devices. For instance, MRI and positron emission tomography would likely have different data structures and different data flows, as dictated by the fact that MRI data are acquired (often line by line) in the spatial frequency domain (k-space) whereas positron emission tomography raw data would be a list of coincidence events at detectors surrounding the sample. A practical platform for image reconstruction should allow such differences in data types and algorithms.

• Facilitate cross-platform development. The platform should be operating-system independent such that users can deploy it on their platform of choice, which may be dictated by the medical imaging device or their local computing environment.

• Facilitate prototyping. It is often easier to prototype new reconstruction algorithms in high-level interactive environments such as Matlab and Python. The platform should support reconstruction algorithms partly implemented in a scripting language, so changes can be made to the algorithm quickly and without recompilation during the development process.

• Facilitate high-performance computing. GPUs and emerging many-core CPUs have added a powerful and affordable high-performance computing technology to the medical imaging reconstruction community. A suitable reconstruction framework must enable a seamless integration of such technologies. Moreover, developers should be able to integrate high-performance computing technology that was not defined at the time the framework was developed, e.g., a new type of parallel processor. This can be achieved in part through a modular design where individual modules can be compiled and inserted in the framework using a plugin architecture.

• Facilitate deployment. It should be possible to deploy the newly developed reconstruction algorithms in a clinical research environment. This deployment must be transparent to the users; it should be online in the sense that reconstructed images are created either concurrently with or immediately after data acquisition such that the images can be used in the regular clinical or research workflow.

This article describes a modular open source medical image reconstruction framework called “Gadgetron,” which has been designed to encompass all of the aforementioned requirements. The Gadgetron contains a streaming reconstruction pipeline in which data pass through a set of modules or “Gadgets.” The framework can be configured dynamically by assembling and connecting these Gadgets. In addition, the Gadgetron has a set of toolboxes that provide common image reconstruction tools, data structures, iterative solvers, and GPU components. The architecture is described here in detail, and framework functionality is illustrated with MRI reconstruction examples.

ARCHITECTURE AND IMPLEMENTATION

The design of the Gadgetron is based on the observation that most image reconstruction algorithms consist of a set of well-defined tasks that can be modularized and reused in several different reconstruction scenarios. As a simple example, a k-space raw data Hamming window may be used in multiple different MRI reconstructions to reduce ringing artifacts. One can think of each of these reconstruction modules as a Gadget that data are fed through; it has an input and an output. Multiple Gadgets are wired together (connected) to assemble a reconstruction process (a Gadget stream). The Gadget stream is configured dynamically at run-time using a plugin architecture. More specifically, the Gadgets of a particular reconstruction stream can reside in multiple independently compiled binaries (shared libraries) and are loaded at run-time based on a Gadget stream configuration in an extensible markup language (XML) file. Data are passed from a given client into the Gadgetron streaming architecture through a Transmission Control Protocol/Internet Protocol (TCP/IP) connection through which the resulting images are eventually returned to the client application.
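The idea of assembling a processing chain at run-time from a configuration description can be illustrated with a minimal sketch. This is plain Python, not the Gadgetron API; the registry, gadget names, and parameters are all hypothetical stand-ins for the shared-library/XML mechanism described above.

```python
# Minimal sketch of a dynamically configured streaming pipeline.
# Each "gadget" is a stage with a process() method; the pipeline is
# assembled at run-time from a configuration list, loosely analogous
# to loading Gadgets from shared libraries named in an XML file.

REGISTRY = {}  # name -> gadget class (stands in for plugin loading)

def gadget(name):
    """Register a gadget class under a configuration name."""
    def wrap(cls):
        REGISTRY[name] = cls
        return cls
    return wrap

@gadget("scale")
class ScaleGadget:
    def __init__(self, factor=2.0):
        self.factor = factor
    def process(self, data):
        return [x * self.factor for x in data]

@gadget("clip")
class ClipGadget:
    def __init__(self, limit=10.0):
        self.limit = limit
    def process(self, data):
        return [min(x, self.limit) for x in data]

def build_stream(config):
    """Instantiate gadgets from a (name, params) configuration."""
    return [REGISTRY[name](**params) for name, params in config]

def run_stream(stream, data):
    for g in stream:  # data pass through each gadget in turn
        data = g.process(data)
    return data

stream = build_stream([("scale", {"factor": 3.0}), ("clip", {"limit": 10.0})])
print(run_stream(stream, [1.0, 2.0, 5.0]))  # [3.0, 6.0, 10.0]
```

Swapping, reordering, or reparameterizing stages then requires only a change to the configuration list, mirroring how the Gadget stream is reconfigured via the XML description without recompilation.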

Each Gadget may implement a given task entirely, or it may rely on a set of support libraries or “toolboxes.” The Gadgetron framework contains several toolboxes, which provide a wide range of functionality, e.g., basic data structures such as multidimensional arrays (for CPUs and GPUs), Cartesian and non-Cartesian Fourier transforms, basic linear algebra subprograms (BLAS) (10) and LAPACK (11) interfaces, iterative solvers, etc. The toolboxes can of course also be used as shared libraries in independent “stand-alone” image reconstruction programs that do not use the Gadgetron streaming architecture but instead provide a more traditional reconstruction executable.

The Gadgetron is implemented in C++ using several existing open source libraries. Network communications, streaming architecture, and message queues are implemented using the ADAPTIVE Communication Environment (12), and some memory management features such as smart pointers and Python interfacing are derived from the Boost C++ Libraries. Some Gadgets are implemented on the GPU to achieve faster reconstructions. All current GPU examples in the Gadgetron framework are implemented using Compute Unified Device Architecture (CUDA, Nvidia, Santa Clara, CA).

In the following sections, we provide design and implementation details for the individual components of the Gadgetron framework. When referring to specific


FIG. 1. Gadgetron architecture. The Gadgetron is in communication with a client application through a TCP/IP connection. The client application sends data to the Gadgetron, and associated with each data package is a Message ID. Based on the Message ID, control of the socket is handed over to a specific Reader, which is capable of deserializing the incoming data package. The data package is converted to message blocks that are added to the first Gadget's queue. Data are then passed down the Gadget stream, where each Gadget can modify and transform the data. Any Gadget can return images (or partially processed data) to the Gadgetron framework. Based on the Message ID of this return data package, the control of the socket and the data are handed to a particular Writer, which is responsible for writing the return message to the client.

source code components such as class and function names, a monospace font will be used, e.g., GadgetStreamController.

Gadgets and the Streaming Reconstruction Pipeline

A schematic outline of the Gadgetron framework is shown in Fig. 1. A Gadgetron reconstruction process (Gadget stream) consists of three main components: “Readers,” “Writers,” and “Gadgets.” These components are assembled and controlled by a GadgetStreamController, which is also responsible for maintaining the connection with the client. The role of the Reader is to receive and deserialize data from the client, and the Writer serves the (opposite) role of serializing and returning data to the client. Once the data have been received, they are passed down the chain of Gadgets. Each Gadget can modify or transform the data before passing it on to the subsequent Gadget. This section describes the function and relationship of these components in more detail.

The Gadgetron maintains communication with client applications through a TCP/IP socket. Connections with multiple simultaneous clients are supported, and each individual connection is managed by a software module (a class) called GadgetStreamController. The communication protocol with the client application is as follows:

1. A connection request from a client is received and a new instance of a GadgetStreamController is created.

2. The client sends an XML configuration file defining the desired reconstruction pipeline to the Gadgetron. The GadgetStreamController loads the required data serialization and deserialization components (Writers and Readers, see later) and Gadgets from shared libraries as specified in the configuration file.

3. The client sends an XML file with parameters for the reconstruction. This XML document is passed to each individual Gadget, which can interpret this document in the context of the function it is responsible for.

4. The client starts streaming data to the GadgetStreamController, which uses its set of Readers to deserialize the data and pass them on to the Gadgets.

5. Images (or partially reconstructed datasets) are returned to the GadgetStreamController from the Gadgets. The GadgetStreamController uses Writers to serialize the reconstructed images and pass them back to the client.

Associated with each Gadgetron stream is a set of Readers and Writers, which are responsible for the deserialization and serialization of data packages on the TCP/IP connection (see Fig. 1). As the XML configuration file is received from the client, the GadgetStreamController loads the corresponding set of Readers and Writers and associates each Reader and Writer with a unique Message ID. Each transmission of data is initiated with transmission of a Message ID such that the GadgetStreamController can pass control of the socket connection to the appropriate Reader for deserialization of the data package. When returning data packages (images) to the client, the packages are read from an output queue where they are again associated with a Message ID that allows the GadgetStreamController to pick the appropriate Writer for serialization of the package. This abstraction of serialization and deserialization allows flexible configuration of the Gadgetron framework, where several different types of packages can be processed by the same Readers and Writers when applicable. Moreover, the behavior of the Gadgetron framework in response to a specific type of data can be changed by replacing the Reader/Writer associated with that type of data. For example, in one configuration the Gadgetron framework may pass a specific type of MRI raw data directly to the Gadget stream and in
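The Message ID dispatch described above can be sketched as follows. The IDs, the 2-byte header, and the payload formats are invented for illustration; the actual Gadgetron wire protocol is defined in its source distribution.

```python
import struct

# Sketch of Message ID based Reader dispatch (hypothetical IDs and
# wire format, not the actual Gadgetron protocol). Each incoming
# package starts with a Message ID; the ID selects the Reader that
# knows how to deserialize the remainder of the package.

MSG_ACQUISITION = 1
MSG_IMAGE = 2

def read_acquisition(payload):
    # pretend raw data: little-endian float32 samples
    n = len(payload) // 4
    return ("acquisition", list(struct.unpack("<%df" % n, payload)))

def read_image(payload):
    return ("image", payload.decode("ascii"))

READERS = {MSG_ACQUISITION: read_acquisition, MSG_IMAGE: read_image}

def dispatch(packet):
    """Peel off the 2-byte Message ID and hand the rest to a Reader."""
    (msg_id,) = struct.unpack("<H", packet[:2])
    return READERS[msg_id](packet[2:])

pkt = struct.pack("<H", MSG_ACQUISITION) + struct.pack("<2f", 1.0, 2.0)
print(dispatch(pkt))  # ('acquisition', [1.0, 2.0])
```

Replacing an entry in the `READERS` table changes how that data type is handled without touching the rest of the chain, which is the flexibility the text attributes to the Reader/Writer abstraction.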


FIG. 2. Each Gadget is an active object with at least one thread of execution. The thread is responsible for picking generic data packages off the input message queue and decoding the data packages into the data types that the Gadget is expecting. If the data types match the Gadget's signature, the data will be passed on to a data processing function where the data can be modified and/or transformed into new data packages and enqueued on the next Gadget's input queue.

another configuration the same data type may be corrected or filtered in some way depending on the Reader associated with a specific Message ID. The Gadgetron framework provides abstract classes for Readers/Writers, and a specific deserialization/serialization behavior can be achieved by implementing these abstract interfaces. One of the toolboxes (see later) contains a GadgetronConnector class that allows the user to implement communication with the Gadgetron framework by registering Readers and Writers with the GadgetronConnector. Specific examples of how to interface raw data and image data are contained in the distribution.

The basic design of a Gadget is illustrated in Fig. 2. Each individual Gadget is an active object (13). It has its own thread (or threads) of execution and an input message queue, which holds data packages from upstream Gadgets. The data packages are processed in the order they arrive at the Gadget. Each Gadget has a built-in system for checking that the data on the input queue are of the type that the Gadget is expecting.

In many practical applications, a Gadget would operate on more than one data argument (e.g., it may operate on a data header and a multidimensional array containing the actual image data). The Gadgetron framework provides a set of templated Gadget classes that process different numbers of arguments. Users of the Gadgetron framework can implement a Gadget that processes two arguments by simply inheriting from one of the templates and implementing just one function (the process function).
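The active-object pattern of Fig. 2 can be sketched in a few lines: each stage owns a thread that drains an input queue, type-checks packages, and enqueues results on the next stage's queue. The class and method names are illustrative Python, not the templated C++ classes of the framework.

```python
import queue
import threading

# Sketch of a Gadget as an "active object": its own thread drains an
# input queue, checks that packages match the expected types, and
# enqueues results on the next Gadget's queue.

class Gadget:
    expected = (object,)  # type signature of accepted packages

    def __init__(self, next_gadget=None):
        self.inq = queue.Queue()
        self.next_gadget = next_gadget
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def _run(self):
        while True:
            pkg = self.inq.get()
            if pkg is None:  # shutdown sentinel
                break
            if not isinstance(pkg, self.expected):
                raise TypeError("unexpected package type")
            out = self.process(pkg)
            if self.next_gadget is not None and out is not None:
                self.next_gadget.inq.put(out)

    def process(self, pkg):  # override in subclasses
        return pkg

class SquareGadget(Gadget):
    expected = (float, int)
    def process(self, pkg):
        return pkg * pkg

class CollectGadget(Gadget):
    def __init__(self):
        super().__init__()
        self.results = []
    def process(self, pkg):
        self.results.append(pkg)

sink = CollectGadget()
head = SquareGadget(next_gadget=sink)
for x in (1.0, 2.0, 3.0):
    head.inq.put(x)
head.inq.put(None)      # shut down the head gadget...
head.thread.join()
sink.inq.put(None)      # ...then the sink, after all data are forwarded
sink.thread.join()
print(sink.results)  # [1.0, 4.0, 9.0]
```

Because each stage runs on its own thread, upstream and downstream work overlap, which is what gives the streaming pipeline its throughput.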

Toolboxes

Core functionality to be shared across multiple Gadgets and standalone applications is provided in toolboxes. In this section, we provide a nonexhaustive list of some key toolbox functionality.

• Multidimensional arrays. The framework comes with a templated multidimensional array (NDArray). This array structure is used throughout the Gadgetron framework to store and process data. Both CPU- and GPU-based arrays are available, with operators to transparently handle memory transfers between the available computational resources.

• Fast Fourier transforms. There are multiple toolbox components that provide access to various flavors of the Fast Fourier Transform (FFT). A CPU-based FFT is available using the FFTW3 library (14) and a GPU-based FFT using the CUFFT library from Nvidia. The framework also contains a GPU implementation of the nonuniform FFT (8).
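For readers unfamiliar with MRI conventions, the k-space/image transform is normally “centered” (DC in the middle of the array), which requires shifts around the FFT. The following is a numpy illustration of that convention, not the Gadgetron toolbox API.

```python
import numpy as np

# Centered ("DC in the middle") FFT convention commonly used for
# MRI k-space data: ifftshift -> transform -> fftshift.

def k2i(kspace, axes=(-2, -1)):
    """Centered inverse FFT: k-space -> image."""
    return np.fft.fftshift(
        np.fft.ifftn(np.fft.ifftshift(kspace, axes=axes), axes=axes),
        axes=axes)

def i2k(image, axes=(-2, -1)):
    """Centered forward FFT: image -> k-space."""
    return np.fft.fftshift(
        np.fft.fftn(np.fft.ifftshift(image, axes=axes), axes=axes),
        axes=axes)

img = np.random.rand(8, 8) + 1j * np.random.rand(8, 8)
assert np.allclose(k2i(i2k(img)), img)  # round trip recovers the image
```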

• Iterative solvers. The framework provides a set of templated classes for iterative reconstruction. A solver base class is provided, and a number of iterative solver implementations are derived from this base class, such as a conjugate gradient solver for linear least squares problems (15) and Split Bregman solvers for total variation/compressed sensing based reconstructions (16). The solvers are completely generic and straightforward to configure for suitable reconstruction problems. For example, the conjugate gradient solver can be used to solve a minimization problem such as min_ρ ‖Eρ − m‖₂², where ρ is the unknown image to be reconstructed, E is the image modality encoding matrix, and m is the measured data. The user must supply a matrix operator that implements multiplication with E and Eᴴ. This is done by filling in (in a derived class) two pure virtual functions of a linearOperator base class. The solvers allow multiple operators to be added such that multiple terms can be minimized simultaneously. The source code contains multiple examples for users to build on and modify for their own specific purposes. A specific example of using these solvers for non-Cartesian parallel MRI is described later. In addition to image encoding operators, the framework provides a set of commonly used regularization operators, such as an identity operator, a partial derivative operator, a similarity image operator, and a convolution operator, all derived from the linearOperator base class.

Python Prototyping

To make the transition from prototyping to deployment easier, the Gadgetron is designed so that the functionality of individual Gadgets can be implemented using Python as a scripting language. Python was chosen over Matlab, as it can be obtained at no cost on all the supported platforms. The Python architecture is outlined in Fig. 3.

The Python scripting functionality can be accessed by the user through a PythonGadget, and the framework provides certain hooks that the user can implement callback functions for in a Python module. The user specifies which Python module the Gadget should load and which functions in the Python module should be called in response to which events.

The PythonGadget is not responsible for loading the Python modules. Due to the way that the Python interpreter is implemented, it is not safe for multiple Gadgets to access


FIG. 3. Python prototyping architecture. The Gadgetron framework provides a specialized Gadget (PythonGadget). The PythonGadget communicates with a Python interpreter instance through a PythonCommunicator object (a process-wide singleton), which ensures that only one Gadget attempts to access the interpreter at any given time. At run-time, the PythonGadget will instruct the PythonCommunicator to load (import) a Python module in the interpreter instance as specified in the XML configuration file. When data arrive in the Gadget, they are passed on to the Python module (through the PythonCommunicator). Each loaded Python module receives a reference to the PythonGadget such that reconstruction results can be passed back to the calling Gadget and continue down the stream. As indicated, a given Gadget stream can contain a mixture of regular C/C++ Gadgets and Python Gadgets.

the interpreter at the same time, and consequently the communication with the Python interpreter is centralized in the PythonCommunicator (a process-wide singleton for the Gadgetron). When the PythonGadget loads, it will request that the PythonCommunicator load the Python module. When data need to be passed to the Python module, they will be passed first to the PythonCommunicator, which will pass them on to the appropriate Python module when the Python interpreter becomes available.

As indicated in Fig. 3, it is possible to have an arbitrary number of PythonGadgets in the reconstruction pipeline. Moreover, the PythonGadgets can be mixed with standard Gadgets implemented purely in C/C++. This enables the user to reuse existing, efficient implementations of reconstruction steps while maintaining the capability of prototyping in Python.
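A user module of the kind loaded by the wrapper Gadget might look like the sketch below. The hook names (`configure`, `process`) and the way results are handed back through a reference to the calling Gadget are assumptions for illustration; the exact callback contract is defined by the Gadgetron distribution.

```python
# Sketch of a Python module as a wrapper Gadget might load it.
# Hook names and the result-return path are illustrative only.

class MockGadgetRef:
    """Stands in for the reference to the calling Gadget that a
    loaded module receives, used to return results downstream."""
    def __init__(self):
        self.returned = []
    def return_image(self, img):
        self.returned.append(img)

# --- contents of a hypothetical user module -------------------------
state = {}

def configure(gadget_ref, params):
    """Called once with the XML-derived parameter dictionary."""
    state["scale"] = float(params.get("scale", 1.0))
    state["gadget"] = gadget_ref

def process(header, data):
    """Called per data package; hands results back to the caller."""
    result = [x * state["scale"] for x in data]
    state["gadget"].return_image((header, result))
# --------------------------------------------------------------------

ref = MockGadgetRef()
configure(ref, {"scale": "2.0"})
process({"slice": 0}, [1.0, 2.0])
print(ref.returned)  # [({'slice': 0}, [2.0, 4.0])]
```

Edits to such a module take effect on the next run without recompiling anything, which is the rapid-prototyping benefit the text describes.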

Standalone Compilation

The core reconstruction algorithm components included in the framework (e.g., FFTs, iterative solvers, etc.) are contained in the various toolboxes (see earlier) and compiled into shared libraries. As previously mentioned, these toolboxes can either be linked into Gadgets or be used in standalone applications. To exemplify how to use these toolboxes in practice, we include a number of small standalone applications. These applications are compiled outside the streaming client/server architecture of the Gadgetron (Fig. 1) and demonstrate how to use the toolboxes in third-party applications.

EXAMPLE APPLICATIONS

This section outlines some of the example applications that are included in the initial Gadgetron distribution. More algorithms will be added in future releases and made available continuously on the Gadgetron web page (see later). It is beyond the scope of this article to explain the algorithms in detail, but references to some key reconstruction steps are provided in the following. We use MRI as the primary demonstration modality.

Real-Time Cartesian Parallel Imaging

The Gadgetron includes a GPU-based real-time reconstruction engine for Cartesian parallel MRI. The implemented algorithm is an extension of the GRAPPA algorithm (3), which has been optimized for high-throughput reconstruction of real-time MRI using a large number of receive channels. To our knowledge, the closest implementation in the literature is the one by Saybasili et al. (17). In the present Gadgetron implementation, the GRAPPA convolution kernels are Fourier transformed to image space and applied directly to the aliased images using pixelwise multiplications. Additionally, the image space unmixing coefficients for individual channels are combined in image space using a B1-weighted coil combination procedure as per Walsh et al. (18). Data are assumed to be acquired in a time-interleaved fashion, so that a number of neighboring frames can be averaged and used for calibration data (19).

Figure 4 outlines the Gadgetron GRAPPA reconstruction chain. The core algorithm is implemented in the GrappaGadget and the GrappaUnmixingGadget. The GrappaGadget is responsible for monitoring the acquired raw data and will trigger an update of GRAPPA coefficients when required (and data are available). The GRAPPA unmixing coefficients and the required coil sensitivity maps for B1-weighted coil combination are calculated on the GPU by the GrappaCalculator. Once the GRAPPA coefficients have been calculated, they are downloaded to the CPU where they are stored in a memory structure that can be accessed by the downstream GrappaUnmixingGadget. The GrappaUnmixingGadget performs the Fourier transform of k-space data and linear combination of the available coils using the coefficients calculated by the GrappaCalculator. The GRAPPA calculation is done asynchronously with the Fourier transform and image unmixing, which enables this configuration to have a high frame rate. The unmixing coefficients are updated as often as the hardware is capable and always when the slice orientation changes. The time that it takes to update the unmixing coefficients depends on the number of receiver channels, the image matrix size, and the available hardware, but for most applications it is on the order of 200 ms. More specifically, on a 16-CPU-core (Intel Xeon 2.67 GHz) computer with 24 GB of RAM and an Nvidia GeForce GTX 590 graphics card, the GRAPPA unmixing coefficients for parallel imaging rate 4 with 16 input channels could be calculated in under 200 ms. As the coefficients are updated independently of the unmixing, the frame rate is not limited by how quickly the coefficients can be calculated. The frame rate is determined by how quickly the raw data can be Fourier transformed and the coefficients can be applied. With the previously described hardware, images could be displayed on the scanner with less than one frame (approximately 100 ms) of latency.
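The per-frame work in this scheme is cheap by construction: once the unmixing coefficients exist in image space, each frame is reconstructed by a pixelwise multiply and a sum over receive channels. The numpy sketch below shows only that application step; computing the coefficients (GRAPPA kernel calibration and coil combination weights) is the expensive, asynchronous part and is omitted here.

```python
import numpy as np

# Applying image-space unmixing coefficients: pixelwise multiply of
# the aliased channel images by per-channel coefficients, then a sum
# over channels. This is the cheap per-frame step that runs at the
# frame rate; coefficient calibration is not shown.

def unmix(aliased, unmix_coeffs):
    """aliased:      (channels, ny, nx) aliased channel images.
    unmix_coeffs: (channels, ny, nx) image-space coefficients.
    Returns the combined, unaliased image of shape (ny, nx)."""
    return np.sum(aliased * unmix_coeffs, axis=0)

chans, ny, nx = 4, 6, 6
aliased = np.random.rand(chans, ny, nx) + 1j * np.random.rand(chans, ny, nx)
coeffs = np.random.rand(chans, ny, nx) + 1j * np.random.rand(chans, ny, nx)
img = unmix(aliased, coeffs)
assert img.shape == (ny, nx)
```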


FIG. 4. GRAPPA Gadget chain. The NoiseAdjustGadget and PCAGadget create virtual channels in which the noise is white and the data have undergone a principal component decomposition. An optional coil reduction stage is then used to remove the PCA coils. The GRAPPA coefficients are calculated in k-space but Fourier transformed to image space, where they are combined into one set of unmixing coefficients as described in Ref. 17. The GRAPPA coefficient calculation also includes an optional target channel reduction for further speed-up of the reconstruction. The calculation of unmixing coefficients is asynchronous with the stream of raw data, and the coefficients (weights) are updated as frequently as possible. Sample reconstructed images are shown on the bottom right.

In addition to the high-throughput GRAPPA functionality described earlier, the implementation also allows for receiver channel compression based on principal component analysis (PCA) (20,21). The implemented scheme is inspired by Huang et al. (22) in that it implements two stages of channel compression. After noise prewhitening in the NoiseAdjustGadget, the channels are converted to virtual channels using PCA coefficients in the PCAGadget. In the present implementation, the coefficients are calculated based on the k-space data from the first frame of real-time data. Subsequently, the data pass through a coil reduction Gadget, which simply discards data above a certain channel number (here the number of channels is reduced from 32 to 16). In the GRAPPA calculation itself, further channel compression is possible by limiting the number of target channels included in the GRAPPA reconstruction. For example, the GRAPPA coefficients may be calculated from 16 source channels to eight target channels. This reduces the computational load and memory requirements of the GRAPPA calculation without explicitly reducing the parallel imaging performance. It is beyond the scope of this article to analyze appropriate settings for these compression factors, but in practice we have found that with a 32-channel receive array, a reduction to 16 channels upstream and eight channels downstream results in only a marginal reduction of image quality.
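The PCA-based channel compression above projects the physical channels onto their dominant principal components. A minimal SVD-based sketch of the idea (the channel counts and function name are illustrative; the PCAGadget's internals may differ):

```python
import numpy as np

def pca_compress(data, n_virtual):
    """Compress multichannel k-space data [channels, samples] into
    n_virtual virtual channels along the dominant principal
    components of the channel covariance."""
    u, s, vh = np.linalg.svd(data, full_matrices=False)
    # Project onto the leading left-singular vectors (virtual coils).
    return u[:, :n_virtual].conj().T @ data

# Toy example: 8 physical channels that are noisy mixtures of
# 3 underlying sources, compressed to 4 virtual channels.
rng = np.random.default_rng(1)
sources = rng.standard_normal((3, 256))
mixing = rng.standard_normal((8, 3))
data = mixing @ sources + 0.01 * rng.standard_normal((8, 256))
virtual = pca_compress(data, 4)
```

When the channels are highly correlated, as in a large receive array, the leading virtual channels retain nearly all of the signal energy, which is why the 32-to-16 reduction costs little image quality.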

Non-Cartesian Parallel MRI

As an example of parallel MRI using non-Cartesian trajectories, the Gadgetron includes a GPU-based implementation of non-Cartesian SENSE (23). The implementation included in the Gadgetron is based on previous work (9) but adapted for the Gadgetron architecture. The main functionality of the iterative solver is implemented in the toolboxes described earlier. The Gadget architecture is used to wrap the solver and data handling buffers in a reusable manner.

The conjugate gradient SENSE Gadget pipeline is illustrated in Fig. 5. The noise adjustment and PCA virtual coil generation Gadgets have been reused from the GRAPPA reconstruction described earlier. Additionally, a series of other Gadgets for image scaling, magnitude extraction, etc. are also reused. The non-Cartesian SENSE reconstruction uses a conjugate gradient iterative solver to solve:

ρ_λ = arg min_ρ { ‖Eρ − m‖₂² + λ‖Lρ‖₂² },   [1]

where ρ is the unknown image to be reconstructed, m is the measured (possibly undersampled) data, and E is the encoding matrix describing (a) multiplication with the coil sensitivities, (b) Fourier transform to k-space, and (c) sampling on the non-Cartesian trajectory. The matrix L is a diagonal regularization matrix, and λ is a regularization weight that controls the balance between the data consistency and regularization terms in the optimization (9).
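Eq. [1] is typically solved through its normal equations, (EᴴE + λLᴴL)ρ = Eᴴm, which are Hermitian positive definite and hence suited to the conjugate gradient method. The following toy sketch uses a small dense matrix in place of the non-Cartesian encoding operator (the real solver applies E via gridding on the GPU; all names here are illustrative):

```python
import numpy as np

def cg_solve(A, b, iters=200, tol=1e-10):
    """Minimal conjugate gradient for a Hermitian positive-definite A."""
    x = np.zeros_like(b)
    r = b - A @ x
    p = r.copy()
    rs = np.vdot(r, r)
    for _ in range(iters):
        Ap = A @ p
        alpha = rs / np.vdot(p, Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = np.vdot(r, r)
        if np.sqrt(abs(rs_new)) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

# Toy encoding matrix E and identity regularization L.
rng = np.random.default_rng(2)
E = rng.standard_normal((30, 20))
rho_true = rng.standard_normal(20)
m = E @ rho_true
lam = 1e-3
L = np.eye(20)
# Normal equations of Eq. [1]: (E^H E + lambda L^H L) rho = E^H m
A = E.T @ E + lam * L.T @ L
rho = cg_solve(A, E.T @ m)
```

For small λ and well-conditioned E, the recovered ρ closely matches the true image, with λ trading residual data consistency against the regularization term.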

In practice, information about the coil sensitivity distributions is needed to apply E and Eᴴ during the iteration. To facilitate computation of an estimate of these sensitivities, the Gadget holds a queue for buffering the data as they are received. The data are assumed to be acquired in such a way that a set of neighboring frames can be averaged to yield a fully sampled (but low temporal resolution) dataset. This can be achieved with a rotating radial acquisition scheme or with the golden ratio radial acquisition scheme (24), which was used in this case. Data from previous frames are buffered in the conjugate gradient SENSE Gadget, and before reconstruction of a frame, new coil sensitivity maps are formed from the buffered data. Based on this unaliased dataset, it is also possible to extract a regularization mask that is used to populate the diagonal matrix L.
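The golden ratio ordering (24) underpinning this buffering scheme rotates each new radial profile by 180°/φ, where φ is the golden ratio, so any window of consecutive profiles covers k-space roughly uniformly and can be averaged into a calibration dataset. A short sketch of the angle sequence (the helper name is illustrative):

```python
import math

# Golden-angle increment: 180 degrees divided by the golden ratio,
# approximately 111.25 degrees between consecutive radial profiles.
GOLDEN_ANGLE = math.pi / ((1 + math.sqrt(5)) / 2)

def profile_angles(n, start=0.0):
    """Angles (radians, folded into [0, pi)) of n consecutive
    golden-angle radial profiles."""
    return [(start + i * GOLDEN_ANGLE) % math.pi for i in range(n)]

angles = profile_angles(8)
```

Because the increment is irrational relative to π, no two profiles coincide, which is what lets the sliding buffer of past frames serve as fully sampled calibration data at any point in the stream.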

The conjugate gradient SENSE Gadget has several tunable parameters that can be used to trade off image quality, signal-to-noise ratio, frame rate, and computational speed. In practice, the user can control how many k-space projections are used in each reconstructed frame (temporal resolution), how many data acquisitions are shared among neighboring frames, the regularization factor λ, and the number of iterations and stopping criterion in the conjugate gradient solver. Furthermore, the user can take advantage of the PCA channel compression to tune the reconstruction to the performance of the available reconstruction hardware. In the current implementation, we were able to perform non-Cartesian parallel imaging reconstruction at over 10 frames per second using a single GPU for a resolution of 192 × 192. It should be noted that PCA coil reduction from 32 to 8 virtual channels was used to achieve this frame rate.

FIG. 5. Non-Cartesian parallel MRI Gadget chain. Most of the reconstruction pipeline is reused from the Cartesian parallel MRI Gadget chain (Fig. 4). After forming virtual channels, the data pass into the conjugate gradient SENSE Gadget, where they are buffered. Once enough data are available to reconstruct a new frame, coil sensitivities and regularization data are calculated from a time average of previously acquired data and used along with the undersampled data in a conjugate gradient iterative solver. The reconstructed images are passed on to subsequent Gadgets in the chain for further processing.

Further Solver Examples

A frequently occurring image reconstruction task is to reduce the degree of noise and/or blurring in an image, the latter often under the assumption that the image degradation was caused by convolution of the desired image with a known point spread function. The initial Gadgetron release provides three iterative solvers that can be used to restore noisy or blurred images: linear least squares by the conjugate gradient algorithm (15), and both an unconstrained and a constrained Split-Bregman solver (16) for total variation based minimization. The solvers are derived from variational problems minimizing, respectively,

min_ρ ‖∇ρ‖₂² + λ‖Eρ − m‖₂²

min_ρ |ρ|_TV + λ‖Eρ − m‖₂²

min_ρ |ρ|_TV   s.t.   ‖Eρ − m‖₂² < σ²

where ρ is the desired image to be restored, E is the linear degradation operator, and m is the acquired image. For deblurring, E is modeled as a ConvolutionOperator and for denoising as an IdentityOperator. The solvers can be set up with just a handful of lines of code, and GPU-based operators are available where high performance is required.
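To make the first functional concrete, consider its 1D denoising analogue with E as the identity: minimizing ‖Dρ‖₂² + λ‖ρ − m‖₂² with a finite-difference operator D, whose normal equations are (DᵀD + λI)ρ = λm. The sketch below solves this tiny problem directly (the Gadgetron's solvers are iterative and GPU-capable; this is only an illustration of the functional):

```python
import numpy as np

n = 64
# Forward-difference operator: rows of D compute rho[i+1] - rho[i].
D = np.diff(np.eye(n), axis=0)

t = np.linspace(0, 1, n)
clean = np.sin(2 * np.pi * t)          # ground-truth image (1D)
rng = np.random.default_rng(3)
m = clean + 0.3 * rng.standard_normal(n)  # noisy observation

lam = 1.0
# Normal equations of: min ||D rho||^2 + lam ||rho - m||^2
rho = np.linalg.solve(D.T @ D + lam * np.eye(n), lam * m)
```

The quadratic gradient penalty suppresses high-frequency noise while leaving the smooth underlying signal nearly untouched; the TV functionals play the same role but preserve sharp edges better.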

The framework also includes a small standalone application that applies the non-Cartesian parallel MRI encoding operator (see earlier) to the latter two total variation (TV) based functionals, demonstrating compressed sensing in combination with parallel MRI.

Integration with a Clinical MRI System

One of the key goals in designing the Gadgetron framework was to enable easy deployment in a clinical research environment. This is facilitated by the flexible TCP/IP interface, which allows clients to connect from any device on which it is possible to open a network socket. The exact steps involved in communication between a given device and the Gadgetron framework will vary greatly from device to device. Here, an example of interfacing a clinical MRI scanner (Siemens Avanto 1.5T and Siemens Espree 1.5T, Siemens Medical Solutions, Erlangen, Germany) with the Gadgetron is demonstrated. The connection between the scanner and the Gadgetron is outlined in Fig. 6.

In this example, the Siemens reconstruction pipeline was modified by adding modules that communicate with the Gadgetron. Two new modules were added: a GadgetronEmitter and a GadgetronInjector. These modules can be inserted anywhere in the vendor reconstruction pipeline by specifying the insertion points in a configuration file. The GadgetronEmitter is responsible for opening a connection (TCP/IP) to the Gadgetron and sending parameters and raw data. When the images have been reconstructed, they are returned to the scanner reconstruction pipeline using the GadgetronInjector. As it is possible to return reconstructed images to the scanner reconstruction pipeline before images are sent to the image database on the MRI scanner host, seamless integration with the clinical scanner is achieved. Example videos can be found at http://gadgetron.sourceforge.net/demo.
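The emitter/injector pattern amounts to a length-prefixed request/response exchange over a socket. The sketch below illustrates that shape with a loopback server standing in for the Gadgetron; the 4-byte length framing here is a simplification for illustration, as the actual Gadgetron wire protocol uses its own message identifiers and headers:

```python
import socket
import struct
import threading

def recv_exact(sock, n):
    """Read exactly n bytes from a socket."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed")
        buf += chunk
    return buf

def serve_once(server_sock):
    """Stand-in for the reconstruction server: read one framed
    message and reply with a framed (transformed) payload."""
    conn, _ = server_sock.accept()
    with conn:
        size = struct.unpack("<I", recv_exact(conn, 4))[0]
        payload = recv_exact(conn, size)
        conn.sendall(struct.pack("<I", len(payload)) + payload.upper())

server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]
threading.Thread(target=serve_once, args=(server,)).start()

# Client side, in the role of the GadgetronEmitter/Injector pair:
with socket.create_connection(("127.0.0.1", port)) as client:
    msg = b"raw k-space frame"
    client.sendall(struct.pack("<I", len(msg)) + msg)
    size = struct.unpack("<I", recv_exact(client, 4))[0]
    reply = recv_exact(client, size)
server.close()
```

Because nothing in this exchange is scanner-specific, the same client logic can run from any device capable of opening a network socket, which is the property the integration relies on.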


FIG. 6. Integration with a clinical MRI scanner (Siemens Avanto/Espree, Siemens Medical Solutions, Erlangen, Germany). The reconstruction software of the MRI system was modified by adding two additional modules to the vendor reconstruction pipeline. One module (GadgetronEmitter) is responsible for opening a connection to the Gadgetron and transmitting data; another module (GadgetronInjector) is responsible for receiving reconstructed images from the Gadgetron. The modules for sending and receiving data were implemented in such a way that they can be added to any existing reconstruction program on the scanner without the need to recompile any MRI scanner software. The location of the GadgetronEmitter and GadgetronInjector in the vendor reconstruction chain is flexible and can be specified with a configuration file. The original vendor reconstruction modules (between GadgetronEmitter and GadgetronInjector) are bypassed by the Gadgetron reconstruction.

AVAILABILITY AND PLATFORM SUPPORT

The Gadgetron is available under a permissive, free software license based on the Berkeley Software Distribution license, which allows users to use, modify, and redistribute the software in source or binary form with virtually no restrictions. The source code and documentation can be found on the Sourceforge Open Source distribution website (http://gadgetron.sourceforge.net).

The framework was designed to be cross-platform, and the source code compiles and runs on Linux, Mac OS X, and Microsoft Windows (Windows 7, 64-bit). Great care was taken during the implementation to ensure that the streaming architecture with dynamic run-time linking works on all platforms. Installation instructions are available on the Sourceforge website.

DISCUSSION

Limitations and Future Directions

The Gadgetron is implemented mainly in C++ and takes advantage of many specific language features such as object inheritance and templates. Consequently, a certain proficiency with these features is an advantage when using the Gadgetron, and this may make it harder for programmers who are less familiar with C++ programming to use the Gadgetron. However, as a Gadget can be implemented by writing just a single function (the process class member function), users with limited C++ knowledge are expected to be able to use the Gadgetron framework. The documentation contains multiple examples of how to implement new Gadgets.

There is a certain implementation overhead involved in working with the Gadgetron streaming architecture. A given algorithm has to be separated into components (Gadgets), and a client has to be written (if the included clients are not adequate). This can be more involved than writing a simple standalone C/C++ application. Furthermore, there is an inherent overhead associated with the TCP/IP communication that one would not incur in a standalone application. However, it is important to note that there are some specific advantages as well. First, the individual modules can be reused in other reconstruction programs, potentially reducing future development time. Second, the Gadgetron streaming framework is inherently multithreaded, because each Gadget has its own thread of execution, which can provide a performance increase to offset the TCP/IP overhead. In fact, in simple tests conducted to explore this, the Gadgetron streaming framework provided better performance for identical reconstructions when compared to a single-threaded standalone application. It should also be noted that the toolboxes included with the Gadgetron can be used in standalone applications, which is illustrated with multiple examples in the source code.
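The one-thread-per-Gadget idea can be illustrated with queue-connected worker threads: each stage reads from its input queue, processes, and forwards results downstream, so independent stages overlap in time. The names below are illustrative toys, not the framework's API:

```python
import queue
import threading

class ToyGadget(threading.Thread):
    """One pipeline stage running in its own thread, in the spirit
    of a Gadget: read from inbox, process, pass downstream."""
    def __init__(self, work, inbox, outbox):
        super().__init__()
        self.work, self.inbox, self.outbox = work, inbox, outbox

    def run(self):
        while True:
            item = self.inbox.get()
            if item is None:            # shutdown marker propagates
                self.outbox.put(None)
                return
            self.outbox.put(self.work(item))

q0, q1, q2 = queue.Queue(), queue.Queue(), queue.Queue()
stages = [ToyGadget(lambda x: x * 2, q0, q1),
          ToyGadget(lambda x: x + 1, q1, q2)]
for s in stages:
    s.start()

for item in [1, 2, 3, None]:            # stream three items, then stop
    q0.put(item)

results = []
while (r := q2.get()) is not None:
    results.append(r)
for s in stages:
    s.join()
```

Because each stage begins work on item N+1 while downstream stages are still processing item N, throughput approaches that of the slowest stage rather than the sum of all stages, which is how the streaming design offsets its communication overhead.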

The framework is envisioned as a platform in continuous development. Functionality will be added through new additional Gadgets, but there are also infrastructural changes planned that will enable a wider range of applications.

Currently, it is not possible to create branches in a Gadgetron stream, i.e., it is not possible to create streams where data pass through different sets of Gadgets based on a condition. For the applications we have focused on so far, this has not been an issue, but it is easy to envision other applications in which it would be advantageous to create multiple branches (e.g., to process different types of images). Such functionality is planned for an upcoming release.

A related limitation is that there is currently no shared memory storage between different instances of Gadget chains. Such functionality would be useful if multiple Gadget chains were active simultaneously and some data were shared between the chains (e.g., calibration data). Currently, this can be achieved using the file system or similar means of sharing data between processes, but in the future, we plan to have a more straightforward global data storage with life cycle management of the stored data objects.

The GPU components in the current version of the Gadgetron are based on Nvidia's CUDA framework, which is limited to Nvidia's GPUs. To make the GPU components hardware vendor independent, it would be an advantage to use OpenCL instead. At present, however, the OpenCL architecture is less feature rich than CUDA and more cumbersome to use in templated class hierarchies, and consequently the framework has limited its scope to CUDA. There are, on the other hand, no fundamental restrictions in the framework for implementing Gadgets and toolboxes using OpenCL, and future releases may be extended with OpenCL support.


In the current implementation, there are some limitations in the Python scripting support. As described, access to the Python interpreter is controlled such that two Gadgets cannot access the interpreter simultaneously. This could have performance implications if multiple Gadgets are implemented in Python and have to compete for the interpreter. With the current Python C/C++ API (version 2.7.3), it has not been possible to overcome this limitation in a way that would work robustly on all supported platforms. A future release will seek to improve the Python support and may also provide support for other scripting languages such as Matlab.
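The serialization described above can be pictured as a single lock guarding all interpreter access: even when Python Gadgets run in separate pipeline threads, their interpreted work executes one at a time. A toy sketch (the names are illustrative, not the wrapper Gadget's implementation):

```python
import threading
import time

# One lock stands in for exclusive access to the embedded
# Python interpreter shared by all Python Gadgets.
interpreter_lock = threading.Lock()
order = []

def python_gadget(name):
    with interpreter_lock:      # only one Gadget at a time
        order.append(name)
        time.sleep(0.01)        # stand-in for interpreted work

threads = [threading.Thread(target=python_gadget, args=(n,))
           for n in ("gadget_a", "gadget_b")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With two such Gadgets in a chain, the total interpreted time is the sum of both stages rather than their maximum, which is the performance implication noted above.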

Finally, the main application of the framework so far has been MRI reconstruction. Consequently, the toolboxes are focused on tasks related to this particular modality. We anticipate that the contents of the toolboxes will naturally evolve as we and other developers embrace new applications and imaging modalities. For example, we are exploring using the framework for deblurring and denoising of light microscopy images using the iterative solvers in the toolboxes.

CONCLUSIONS

We have presented a new open source framework for medical image reconstruction and described several applications for which it can be used (MRI, image denoising, and image deblurring). The architecture is modular and flexible, and it promotes reuse of existing reconstruction software modules in the form of Gadgets. It is possible to implement new Gadgets in C/C++ with integration of GPU acceleration for high frame rate, low latency reconstructions. It is also possible to prototype new reconstruction components using the high-level scripting language Python. The framework and all the example applications are made freely available to the medical image reconstruction community, and we hope that it can serve as a platform for researchers to share and deploy novel reconstruction algorithms.

ACKNOWLEDGMENTS

The authors thank Drs. Peter Kellman and Souheil Inati at the National Institutes of Health and David Hansen, Christian Christoffersen, and Allan Rasmusson at the Department of Computer Science, Aarhus University, for valuable discussion, feedback, and suggestions.

REFERENCES

1. Pruessmann KP, Weiger M, Scheidegger MB, Boesiger P. SENSE: sensitivity encoding for fast MRI. Magn Reson Med 1999;42:952–962.

2. Sodickson DK, Manning WJ. Simultaneous acquisition of spatial harmonics (SMASH): fast imaging with radiofrequency coil arrays. Magn Reson Med 1997;38:591–603.

3. Griswold MA, Jakob PM, Heidemann RM, Nittka M, Jellus V, Wang J, Kiefer B, Haase A. Generalized autocalibrating partially parallel acquisitions (GRAPPA). Magn Reson Med 2002;47:1202–1210.

4. Donoho D. Compressed sensing. IEEE Trans Inform Theory 2006;52:1289–1306.

5. Lustig M, Donoho D, Pauly JM. Sparse MRI: the application of compressed sensing for rapid MR imaging. Magn Reson Med 2007;58:1182–1195.

6. Buckheit JB, Donoho DL. WaveLab and Reproducible Research. Technical Report 474. Department of Statistics, Stanford University, 1995.

7. Hansen MS, Atkinson D, Sorensen TS. Cartesian SENSE and k-t SENSE reconstruction using commodity graphics hardware. Magn Reson Imaging 2008;59:463–468.

8. Sorensen TS, Schaeffter T, Noe KO, Hansen MS. Accelerating the nonequispaced fast Fourier transform on commodity graphics hardware. IEEE Trans Med Imaging 2008;27:538–547.

9. Sørensen TS, Atkinson D, Schaeffter T, Hansen MS. Real-time reconstruction of sensitivity encoded radial magnetic resonance imaging using a graphics processing unit. IEEE Trans Med Imaging 2009;28:1974–1985.

10. Dongarra J, DuCroz J, Hammarling S, Duff I. A set of level 3 basic linear algebra subprograms. ACM Trans Math Softw 1990;16:1–17.

11. Anderson E, Bai Z, Bischof C, Blackford S, Demmel J, Dongarra J, DuCroz J, Greenbaum A, Hammarling S, McKenney A, Sorensen D. LAPACK users' guide, 3rd ed. Philadelphia, PA: Society for Industrial and Applied Mathematics; 1999.

12. Schmidt DC. The ADAPTIVE Communication Environment: object-oriented network programming components for developing client/server applications. In: Proceedings of the 11th Annual Sun Users Group Conference, 1993. pp 214–225.

13. Lavender RG, Schmidt DC. Active object: an object behavioral pattern for concurrent programming. In: Vlissides JM, Coplien JO, Kerth NL, editors. Pattern languages of program design 2. Boston: Addison-Wesley, 1996, Chapter 30.

14. Frigo M, Johnson SG. The design and implementation of FFTW3. Proc IEEE 2005;93:216–231.

15. Golub G, Van Loan C. Matrix computations, 3rd ed. Baltimore, MD: The Johns Hopkins University Press; 1996, Chapter 10, pp 520–530.

16. Goldstein T, Osher S. The split Bregman method for L1-regularized problems. SIAM J Imaging Sci 2009;2:323–343.

17. Saybasili H, Kellman P, Griswold MA, Derbyshire JA, Guttman MA. HTGRAPPA: real-time B1-weighted image domain TGRAPPA reconstruction. Magn Reson Med 2009;61:1425–1433.

18. Walsh DO, Gmitro AF, Marcellin MW. Adaptive reconstruction of phased array MR imagery. Magn Reson Med 2000;43:682–690.

19. Kellman P, Epstein FH, McVeigh ER. Adaptive sensitivity encoding incorporating temporal filtering (TSENSE). Magn Reson Med 2001;45:846–852.

20. Huang F, Vijayakumar S, Li Y, Hertel S, Duensing GR. A software channel compression technique for faster reconstruction with many channels. Magn Reson Imaging 2008;26:133–141.

21. Buehrer M, Pruessmann KP, Boesiger P, Kozerke S. Array compression for MRI with large coil arrays. Magn Reson Med 2007;57:1131–1139.

22. Huang F, Lin W, Duensing GR, Reykowski A. A hybrid method for more efficient channel-by-channel reconstruction with many channels. Magn Reson Med 2012;67:835–843.

23. Pruessmann KP, Weiger M, Börnert P, Boesiger P. Advances in sensitivity encoding with arbitrary k-space trajectories. Magn Reson Med 2001;46:638–651.

24. Winkelmann S, Schaeffter T, Koehler T, Eggers H, Doessel O. An optimal radial profile order based on the golden ratio for time-resolved MRI. IEEE Trans Med Imaging 2007;26:68–76.