Introduction to the BinX Library
eDIKT project team
Ted Wen, tedwen@nesc.ac.uk
Robert Carroll, robertc@nesc.ac.uk

Agenda
- About the BinX project
- A brief introduction to the BinX language
- Introduction to the BinX library
- Advanced API to the BinX library
- Use cases and requirements
  - Dr Bob Mann
  - Dr Chris Maynard
- Discussion

About the BinX project

The problem
- XML is useful to represent metadata
- Scientific datasets can be too large in XML
- Most scientific data are in binary files
- Binary data files are not all standardized
- Binary data files are platform-dependent

BinX – a solution
- Initially designed for the Grid environment
- Annotate data schema for any binary file
- Data elements are marked up in XML
- Describe three levels of features in a binary file
  - Underlying physical representation (byte order)
  - Primitive data types (integer, float)
  - Structure of the dataset (array, table)

The BinX project at eDIKT
- Implementing a software library for BinX
- Develop a series of tools based on the library
- Choose C++ for performance
- Write portable code for different platforms
- Robust and easy to use

Development status
- Requirement gathering from July 2002
- Development started in October 2002
- Prototype finished in December 2002
- Alpha version complete in April 2003
- Beta version to be released in June 2003

The deliverables
- The BinX library
  - Compiled code on different platforms
  - Source code with Open Source license
- Documentation
  - User’s guide
  - Developer’s guide
- Utilities and examples

The BinX Language

What is BinX?
- The Binary XML Description Language
- A language for annotating binary data files
- It describes data types, data structures and attributes such as byte order
- A BinX document is an XML file with metadata of a binary data file

A BinX document

    <dataset byteOrder="bigEndian">
      <definitions>
        <defineType typeName="myTyp">
          <arrayFixed>
            <character-8/>
            <dim indexTo="9"/>
          </arrayFixed>
        </defineType>
      </definitions>
      <file src="myfile.bin">
        <useType typeName="myTyp"/>
        <integer-32 varName="X" />
      </file>
    </dataset>

Callouts on the slide:
- Root element (dataset)
- Data class section (definitions)
- Data instance section (file)
- Abstract data type (defineType)

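The sections of the document above can be picked apart with any XML parser. The following is a minimal sketch in Python using only the standard library; it is not the BinX library's API, and the function name `summarize` is hypothetical. It assumes the document carries no XML namespace, as on the slide.

```python
import xml.etree.ElementTree as ET

# The BinX document from the slide, written with straight quotes.
BINX_DOC = """
<dataset byteOrder="bigEndian">
  <definitions>
    <defineType typeName="myTyp">
      <arrayFixed>
        <character-8/>
        <dim indexTo="9"/>
      </arrayFixed>
    </defineType>
  </definitions>
  <file src="myfile.bin">
    <useType typeName="myTyp"/>
    <integer-32 varName="X"/>
  </file>
</dataset>
"""

def summarize(doc_text):
    """Return the main parts of a BinX document (hypothetical helper)."""
    root = ET.fromstring(doc_text)
    file_elem = root.find("file")
    return {
        "root": root.tag,                       # root element
        "byte_order": root.get("byteOrder"),    # physical representation
        "defined_types": [d.get("typeName")     # data class section
                          for d in root.findall("definitions/defineType")],
        "data_file": file_elem.get("src"),      # data instance section
        "instances": [child.tag for child in file_elem],
    }

summary = summarize(BINX_DOC)
print(summary)
```

The summary separates the type definitions from the instances, which mirrors the split between the data class section and the data instance section.
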
Data elements
- Primitive data elements
  - Byte, character, integer, real
- Complex data elements
  - Arrays, struct, union
- User-defined data elements

Primitive data types
- Bit
  - <bit-1>
- Character
  - <character-8>
  - <unicodeCharacter-16>
  - <unicodeCharacter-32>
- Integer
  - <byte-8>
  - <short-16>, <unsignedShort-16>
  - <integer-32>, <unsignedInteger-32>
  - <longInteger-64>, <unsignedLongInteger-64>
- Real
  - <ieeeFloat-32>
  - <ieeeDouble-64>
  - <ieeeQuadruple-128>

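Each primitive tag name ends with its width in bits, so most of them map directly onto fixed-size machine types. A rough sketch of such a mapping, using the format codes of Python's `struct` module: the table `BINX_TO_STRUCT` and the helper `size_in_bytes` are hypothetical, treating `<byte-8>` as unsigned is an assumption, and `<bit-1>`, the Unicode characters and `<ieeeQuadruple-128>` are left out because `struct` has no single code for them.

```python
import struct

# Hypothetical lookup table: BinX primitive tag -> struct format code.
BINX_TO_STRUCT = {
    "character-8": "c",
    "byte-8": "B",                  # assumption: treated as unsigned here
    "short-16": "h",
    "unsignedShort-16": "H",
    "integer-32": "i",
    "unsignedInteger-32": "I",
    "longInteger-64": "q",
    "unsignedLongInteger-64": "Q",
    "ieeeFloat-32": "f",
    "ieeeDouble-64": "d",
}

def size_in_bytes(tag):
    """Size of one value of the given BinX primitive, without padding."""
    # The ">" prefix selects standard sizes, independent of the host platform.
    return struct.calcsize(">" + BINX_TO_STRUCT[tag])

for tag in BINX_TO_STRUCT:
    print(tag, size_in_bytes(tag))
```
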
Complex data types
- Arrays
  - Repetitive collection of any data element
  - Multidimensional
  - Three types of arrays
    - Fixed-length array
    - Variable-length array
    - Streamed array
- Struct
  - A sequence of data elements
- Union
  - One of a group of possible data elements, conditional on the discriminant

Arrays

Fixed-length array

    <arrayFixed>
      <ieeeDouble-64/>
      <dim indexTo="3" name="X" />
      <dim indexTo="4" name="Y" />
      <dim indexTo="5" name="Z" />
    </arrayFixed>

Variable-length array

    <arrayVariable sizeRef="byte-8">
      <ieeeFloat-32 />
      <dim indexTo="7"/>
      <dimVariable/>
    </arrayVariable>

Streamed array

    <arrayStreamed>
      <byte-8/>
      <dimStreamed/>
    </arrayStreamed>

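The three array kinds differ in how a reader learns the element count. The sketch below illustrates this in Python under stated assumptions, none of which the slide confirms: `indexTo` is taken as an inclusive upper bound with indices starting at 0, the `sizeRef` value is taken as a one-byte count stored in front of the data that gives the length of the variable dimension, and a streamed array is taken to run to the end of the file. All function names are hypothetical.

```python
import io
import struct
from math import prod

def fixed_count(index_tos):
    """Element count of a fixed array, assuming indices run 0..indexTo."""
    return prod(i + 1 for i in index_tos)

def read_variable_floats(stream, fixed_index_to=7):
    """Read a byte-8 count, then that many rows of 32-bit big-endian floats."""
    (rows,) = struct.unpack(">B", stream.read(1))
    row_len = fixed_index_to + 1
    total = rows * row_len
    values = struct.unpack(f">{total}f", stream.read(4 * total))
    return [list(values[r * row_len:(r + 1) * row_len]) for r in range(rows)]

def read_streamed_bytes(stream):
    """Read byte-8 elements until the stream ends."""
    return list(stream.read())

# Sample data: a count of 2, followed by 2 x 8 floats.
payload = bytes([2]) + struct.pack(">16f", *[float(i) for i in range(16)])
table = read_variable_floats(io.BytesIO(payload))
print(fixed_count([3, 4, 5]), len(table), len(table[0]))
```

Under these assumptions the fixed array on the slide holds 4 x 5 x 6 doubles, while the other two kinds need the data file itself to determine their length.
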
Struct

    <struct>
      <short-16 varName="ID" />
      <integer-32 varName="Count" />
      <ieeeDouble-64 varName="Var" />
    </struct>

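A struct is read field by field in declaration order. As a small illustration (not the BinX library's own reader), the struct above could be decoded in Python as follows. The field table `FIELDS` and the function `read_struct` are hypothetical, and big-endian data without padding is assumed.

```python
import struct

# Field names from the slide, paired with struct format codes.
FIELDS = [("ID", "h"), ("Count", "i"), ("Var", "d")]

def read_struct(buf, byte_order=">"):
    """Decode one record into a dict keyed by varName."""
    fmt = byte_order + "".join(code for _, code in FIELDS)
    values = struct.unpack(fmt, buf[:struct.calcsize(fmt)])
    return dict(zip((name for name, _ in FIELDS), values))

raw = struct.pack(">hid", 7, 1000, 2.5)   # sample record: 2 + 4 + 8 bytes
record = read_struct(raw)
print(record)
```
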
Union

    <union>
      <discriminant>
        <byte-8/>
      </discriminant>
      <case discriminantValue="32">
        <ieeeFloat-32 />
      </case>
      <case discriminantValue="64">
        <ieeeDouble-64 />
      </case>
      <case discriminantValue="0">
        <void-0 />
      </case>
    </union>

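Reading a union means reading the discriminant first and then choosing the case that matches its value. A minimal Python sketch of that logic for the union above: the names `CASES` and `read_union` are hypothetical, the data is assumed to be big-endian, and `<byte-8>` is assumed to be unsigned.

```python
import io
import struct

# discriminantValue -> struct format of the selected case (None for void-0).
CASES = {32: ">f", 64: ">d", 0: None}

def read_union(stream):
    """Return (discriminant, value) for one union instance."""
    (tag,) = struct.unpack(">B", stream.read(1))
    if tag not in CASES:
        raise ValueError(f"no case for discriminant {tag}")
    fmt = CASES[tag]
    if fmt is None:                       # void-0 occupies no bytes
        return tag, None
    (value,) = struct.unpack(fmt, stream.read(struct.calcsize(fmt)))
    return tag, value

sample = io.BytesIO(bytes([64]) + struct.pack(">d", 1.5))
print(read_union(sample))
```
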
User-defined data type

    <defineType typeName="HeaderStruct">
      <struct>
        <character-8 varName="A"/>
        <character-8 varName="B" />
        <integer-32 varName="Length" />
      </struct>
    </defineType>

Data elements as instances

    <file src="myfile.bin">
      <short-16 varName="id"/>
      <arrayFixed varName="name">
        <character-8 />
        <dim indexTo="7" />
      </arrayFixed>
      <struct varName="record">
        <short-16 />
        <ieeeFloat-32 />
      </struct>
    </file>

Reference defined elements

    <definitions>
      <defineType typeName="A">
        <struct>
          <short-16/>
          <integer-32/>
        </struct>
      </defineType>
    </definitions>

    <file src="myfile.bin">
      <useType typeName="A" varName="FirstUse"/>
      <useType typeName="A" varName="SecondUse"/>
    </file>

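Resolving a `useType` reference amounts to looking up the `typeName` among the definitions. One way this could be sketched in Python with the standard XML parser is shown here. The function `resolve_uses` is hypothetical, and the enclosing `<dataset>` element is added only so that the fragment parses as one document.

```python
import xml.etree.ElementTree as ET

DOC = """
<dataset>
  <definitions>
    <defineType typeName="A">
      <struct>
        <short-16/>
        <integer-32/>
      </struct>
    </defineType>
  </definitions>
  <file src="myfile.bin">
    <useType typeName="A" varName="FirstUse"/>
    <useType typeName="A" varName="SecondUse"/>
  </file>
</dataset>
"""

def resolve_uses(doc_text):
    """List (varName, defined element, member tags) for each useType."""
    root = ET.fromstring(doc_text)
    # Map each type name to the single element its definition wraps.
    types = {d.get("typeName"): d[0] for d in root.iter("defineType")}
    resolved = []
    for use in root.iter("useType"):
        body = types[use.get("typeName")]
        resolved.append((use.get("varName"), body.tag, [c.tag for c in body]))
    return resolved

uses = resolve_uses(DOC)
print(uses)
```

Both instances resolve to the same struct layout, so the definition is written once and reused under two variable names.
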
The BinX Library
Alpha version

Fundamental requirements
- Access to data elements in binary files via BinX
  - Parse the BinX document
  - Build in-memory data structures
  - Read data values from the binary file
- Automatic conversion
  - Byte ordering
  - Padding
- Producing BinX document and binary data
  - Generate BinX document for data structures
  - Save assigned data values into binary files

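To make the two automatic conversions concrete, the sketch below shows in Python what byte-order conversion and padding mean at the byte level. It illustrates the concepts only and says nothing about how the BinX library implements them. The function `swap_int32_order` is hypothetical.

```python
import struct

def swap_int32_order(data, src=">", dst="<"):
    """Re-encode a run of 32-bit integers from one byte order to another."""
    count = len(data) // 4
    values = struct.unpack(f"{src}{count}i", data[:count * 4])
    return struct.pack(f"{dst}{count}i", *values)

big = b"\x00\x00\x00\x01\x00\x00\x01\x00"   # the integers 1 and 256, big-endian
little = swap_int32_order(big)
print(little.hex())

# Padding: a short, an integer and a double occupy 14 bytes when packed
# back to back, but a compiler's native layout may insert alignment gaps.
packed_size = struct.calcsize(">hid")
native_size = struct.calcsize("@hid")        # platform dependent
print(packed_size, native_size)
```
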
General use cases
- Data conversion (byte order)
- Data extraction (sub-dataset)
- Data combination (two arrays to one)
- Data presentation (browse, pure XML)

BinX Components
- The library has core functionality to support generic utilities and applications

[Figure: three layers, top to bottom]
- Applications: domain-specific
- Utilities: generic tools (data conversion, extraction, packing/unpacking)
- BinX library core: BinX core functionality (parse BinX document, read binary data)

The BinX library core
- Input: SchemaBinX, binary data file
- Output: DataBinX, in-memory dataset

[Figure: a SchemaBinX document (<dataset>… …</dataset>) and a binary data file (0101010101) feed the BinX library, which builds an in-memory data structure (values loaded on demand) and outputs DataBinX (<short-16>100</short-16>)]

The BinX Utilities
- DataBinX generator
- DataBinX splitter
- SchemaBinX creator
- Binary file indexer

DataBinX generator
- Put binary data inside XML
- For browsing, web service return, query result set

[Figure: a SchemaBinX document (<dataset>… …</dataset>) and binary data (0101010101) pass through the BinX library to produce DataBinX (<short-16>100</short-16>)]

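The idea of putting binary values inside the XML can be illustrated with a short Python sketch. This is not the DataBinX generator itself: `generate_databinx` and `FORMATS` are hypothetical, only a few primitive types are handled, and the attribute value "littleEndian" is an assumption, since the slides show only "bigEndian".

```python
import struct
import xml.etree.ElementTree as ET

# Illustrative subset of BinX primitives mapped to struct format codes.
FORMATS = {"short-16": "h", "integer-32": "i", "ieeeDouble-64": "d"}

SCHEMA = """
<dataset byteOrder="bigEndian">
  <file src="myfile.bin">
    <short-16 varName="id"/>
    <integer-32 varName="X"/>
  </file>
</dataset>
"""

def generate_databinx(schema_text, data):
    """Fill each data element of the schema with its value from `data`."""
    root = ET.fromstring(schema_text)
    # Assumption: "littleEndian" is the other byteOrder value; default is big.
    order = "<" if root.get("byteOrder") == "littleEndian" else ">"
    offset = 0
    for elem in root.find("file"):
        fmt = order + FORMATS[elem.tag]
        (value,) = struct.unpack_from(fmt, data, offset)
        elem.text = str(value)            # value becomes the element text
        offset += struct.calcsize(fmt)
    return ET.tostring(root, encoding="unicode")

databinx = generate_databinx(SCHEMA, struct.pack(">hi", 100, 42))
print(databinx)
```

The output keeps the schema's structure and adds the values as text, which is why the XML form grows much larger than the binary file.
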
DataBinX splitter
- The reverse of DataBinX generator
- Generate binary file for testing, transportation
- Cross-platform (byte order)

[Figure: the reverse of the generator diagram: DataBinX (<short-16>100</short-16>) passes through the BinX library to produce a SchemaBinX document (<dataset>… …</dataset>) and binary data (0101010101)]

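Splitting runs the other way: the values are taken out of the XML and written as bytes in a chosen byte order, leaving a schema behind. A sketch under the same caveats as any illustration here: `split_databinx` and `FORMATS` are hypothetical names, only integer primitives are handled, and this is not the utility's actual interface.

```python
import struct
import xml.etree.ElementTree as ET

# Illustrative subset of integer primitives mapped to struct format codes.
FORMATS = {"short-16": "h", "integer-32": "i"}

DATABINX = """
<dataset byteOrder="bigEndian">
  <file src="myfile.bin">
    <short-16 varName="id">100</short-16>
    <integer-32 varName="X">42</integer-32>
  </file>
</dataset>
"""

def split_databinx(databinx_text, byte_order=">"):
    """Return (schema XML without values, packed binary data)."""
    root = ET.fromstring(databinx_text)
    payload = bytearray()
    for elem in root.find("file"):
        payload += struct.pack(byte_order + FORMATS[elem.tag], int(elem.text))
        elem.text = None                  # strip the value from the schema
    return ET.tostring(root, encoding="unicode"), bytes(payload)

schema_xml, binary = split_databinx(DATABINX)
print(schema_xml)
print(binary.hex())
```

Passing `byte_order="<"` would write the same values for a little-endian target, which is the cross-platform use the slide mentions.
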
SchemaBinX creator
- GUI and Web-based utilities
- Build BinX document interactively
- Create a BinX document based on another

Binary file indexer
- Generating indices for binary data files
- Such indices can be used for fast data access

[Figure: a SchemaBinX document (<dataset>… …</dataset>) and binary data (0101010101) pass through the BinX library to produce an index; labels shown: XY, 00000004]

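An index of this kind can be as simple as a table of byte offsets per named element, so that one value can be read without scanning the whole file. The slides do not describe the indexer's real format, so the following Python sketch is only an assumption about the general idea, with hypothetical names `build_index` and `read_field`.

```python
import struct

def build_index(fields, byte_order=">"):
    """Map each field name to its byte offset in an unpadded record."""
    index = {}
    offset = 0
    for name, code in fields:
        index[name] = offset
        offset += struct.calcsize(byte_order + code)
    return index

def read_field(data, index, name, code, byte_order=">"):
    """Read a single field directly at its indexed offset."""
    (value,) = struct.unpack_from(byte_order + code, data, index[name])
    return value

fields = [("X", "i"), ("Y", "d")]         # hypothetical layout
index = build_index(fields)
data = struct.pack(">id", 5, 2.5)
print(index, read_field(data, index, "Y", "d"))
```
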
Applications for astronomy
- FITS and VOTable conversion

[Figure: the DataBinX utility, built on the BinX library core, sits between a FITS file (header "SIMPLE = T … … END" plus binary data 01010101) and a VOTable document (<?xml version=. <VOTABLE>… …</VOTABLE>)]

FITS → DataBinX → VOTable
- FITS to VOTable conversion

[Figure: conversion pipeline with the components FITS, Preprocessor, SchemaBinX, DataBinX utility, DataBinX, XSLT, XSLT transformer, VOTable]

VOTable → DataBinX → FITS
- VOTable to FITS conversion

[Figure: conversion pipeline with the components VOTable, XSLT, XSLT transformer, Preprocessor, DataBinX, SchemaBinX, DataBinX utility, Binary data, Postprocessor, FITS header, FITS]

FITS-VOTable experiment
- Sample FITS file
  - A data table of 82 rows x 20 fields
  - File size: 37 KB
- Generated DataBinX by DataBinX utility
  - Time spent: 268 ms
  - DataBinX document size: 1.2 MB
- VOTable transformed by MSXML
  - Time spent: about 1 second
  - VOTable document size: 51 KB

Possible future releases
- DataBinX parsing
- Utilities (GUI BinX editor)
- XPath-based data query
- DFDL support
- Preserving special tags
  - For comments, application-specific tags
- Text file support

Features or issues to consider
- Converting floating point numbers
  - 80-bit, 96-bit, 128-bit floating point
- Array manipulation (slice, section)
- SAX-based XML document parsing
  - Use cases in place of DOM parsing
  - Built in the library or as add-on component?
- Database support
  - Annotating database tables?
  - Query database tables through BinX?
- Java version of the library
  - Keeping exactly the same features with the C++ version?
- Supporting XQuery
  - Query binary data files with XQuery on BinX

Support
- For problems of usage:
  - http://www.edikt.org/binx (coming soon)
  - support@edikt.org
- For requirements and suggestions:
  - tedwen@edikt.org
  - robertc@edikt.org