discoveryHub™: What it Is
• Data Integration Platform
  – Flexible Query Engine
  – Powerfully Simple Data Model
  – Mediation Layers
  – Access Layers
discoveryHub™: What it is Not
• NOT a Database!
  – Works with RDBMS, ODBMS, anything…
• NOT the Replacement for Oracle, LIMS or…
  – Complement to acquisition and storage systems
• NOT One size fits all, all you’ll ever need
  – Because there is no such thing
• NOT Any part of a Bicycle
  – “Hub” as in that from which spokes reach out
Query Engine
• Flexible real-time data access
• Integrate disparate data sources
• Automate integration and transformation
Disparate Information Sources
• Distributed and Heterogeneous
• Number of interesting sources very large
• Sources managed independently
• Rapid Change
• Requires a dynamic solution
Complex Data Handling
• Nested Collections and Complex Objects
  – Enables dealing inherently with heterogeneous, hierarchical structured and unstructured data.
• Flexible data model: manage complex things simply.
• Model the problem as it appears, naturally
  – No more complexity than is needed
• Functional Query System
Simplified Data Integration
• Model complex nested collections naturally
• Drill into heterogeneous, disparate nested structures directly
APIs
• Programmatic Access to dH system
  – Java, Enterprise Java
  – Dot.NET (C#, J#, …)
  – DHTML/JavaScript
  – CGI
  – Perl
• Easily extend applications with data integration
Links
• Discovery Hub marketing material
  – http://www1.amershambiosciences.com/aptrix/upp01077.nsf/Content/scierra_discoveryhub_overview
  – Or from http://www.amershambiosciences.com click Scierra discoveryHub
• Discovery Hub support page
  – Log into http://www.amershambiosciences.com/scierra
  – The discoveryHub pane should be at the bottom left
API Overview
• Consistent Architecture
  – Create/Manage connection
  – Execute server-side processes
  – Get and use results
• Java, Enterprise Java
• Dot.NET (C#, J#, …)
• DHTML/JavaScript
• Perl
General Example: Java
• Define connection
• Get the connection and connect.
• Execute commands
• Do something with results
String connectionString = "type=SOCKET server=technet.geneticXchange.com";
Connection myConnection = null;
Properties p = ConnectionFactory.parseProperties(connectionString, false);
myConnection = ConnectionFactory.create(p);
myConnection.connect();
String cmdToRun = "select (#uid: x.uid, #feature: x.feature) from na-get-seqfeat-by-uid(12354);";
String results = myConnection.executeAndReadRaw(cmdToRun);
myConnection.disconnect();
import k1connection.*;
import java.util.Properties;

/**
 * This is a very simple sample class to demonstrate use of the k1connection
 * package.
 */
public class SampleClient
{
    public SampleClient(String connectionString)
    {
        // Default: connection string for socket connection to dev server
        if (connectionString == null) {
            connectionString = "type=SOCKET server=technet.geneticXchange.com";
        }

        Connection myConnection = null;
        try {
            // Get the connection type from the command and build the rest of the
            // options into a Properties set. Get the connection and connect.
            // Display information about the connection.
            Properties p = ConnectionFactory.parseProperties(connectionString, false);
            myConnection = ConnectionFactory.create(p);
            myConnection.connect();
            System.out.println(myConnection.getConnectionInfo());
        }
        catch (ConnectionException e) {
            myConnection = null;
            System.out.println(e.getMessage());
            return;
        }

        try {
            // Run a simple command
            System.out.println(myConnection.executeAndReadRaw("{1,2,3,4,5};"));
        }
        catch (ConnectionException e) {
            System.out.println(e.getMessage());
            return;
        }

        try {
            // Disconnect
            myConnection.disconnect();
        }
        catch (ConnectionException e) {
            System.out.println(e.getMessage());
        }
    }

    public static void main(String[] args)
    {
        // Optional first argument overrides the default connection string
        String connectionString = null;
        if (args.length > 0) {
            connectionString = args[0];
        }
        SampleClient it = new SampleClient(connectionString);
    }
}
Simple Java Example
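The connection string in the example is a whitespace-separated list of key=value options that is turned into a Properties set. The slides do not show how ConnectionFactory.parseProperties does this, so the following is only a sketch of one plausible mapping; the class ConnectionStringSketch and its parse method are hypothetical and are not part of the k1connection package.

```java
import java.util.Properties;

// Hypothetical stand-in, NOT the k1connection implementation.
class ConnectionStringSketch {
    // Assumption: options are separated by whitespace and each option has
    // the form key=value; tokens without '=' are ignored.
    static Properties parse(String connectionString) {
        Properties p = new Properties();
        for (String token : connectionString.trim().split("\\s+")) {
            int eq = token.indexOf('=');
            if (eq > 0) {
                p.setProperty(token.substring(0, eq), token.substring(eq + 1));
            }
        }
        return p;
    }

    public static void main(String[] args) {
        Properties p = parse("type=SOCKET server=technet.geneticXchange.com");
        System.out.println(p.getProperty("type"));
        System.out.println(p.getProperty("server"));
    }
}
```

Keeping the options in a Properties set means the connection type and the remaining settings can be passed around as one object, which is presumably why the sample hands the parsed set to ConnectionFactory.create.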
JavaScript/DHTML API
• Define Connection
• Make connection
• Run sSQL
• Handle Results
myConnection = new dHubConnection(params);
myConnection.connect(constring);
xmlstring = myConnection.executeScriptXML(scriptName, args);
htmlstring = myConnection.formatXML(xmlstring);
dhresult = new dhResultXMLProcessor();
dhresult.makeRecs(xmlstring, null);
dhresult.nextRecord();
uid = dhresult.getMember("uid");
rec = dhresult.getCurRecord();
<script language="JavaScript" src="dhAccessAPIOBJ.js"></script>
<script language="JavaScript" src="dhResultOBJ.js"></script>
<script language="JavaScript">
// create connection object and connect
conpar = new Object();
conpar.hostname = "localhost";
conpar.port = "80";
myConnection = new dHubConnection(conpar);
myConnection.connect("whatever");
scriptName = "ztest.ssql";
args = "bovine feces";

// execute dh server side script, results as XML
xmlstring = myConnection.executeScriptXML(scriptName, args);

// make HTML out of the XML for display.
htmlstring = myConnection.formatXML(xmlstring);

// pick some things out of the results
var dhresult = new dhResultXMLProcessor();
dhresult.makeRecs(xmlstring, null);
while (dhresult.nextRecord())
{
    uid = dhresult.getMember("uid");
    title = dhresult.getMember("title");
    acc = dhresult.getMember("accession");
    org = dhresult.getMember("organism");
    taxon = dhresult.getMember("taxon");
    doSomethingWith(uid, title, acc, org, taxon);
}
</script>
Simple JavaScript Example
Creating Spotfire Tools
• Why Tools?
• Architectural Considerations
• Tool Development Process
Why a Tool?
• Flexibility
  – Interaction with user and discoveryHub
  – Does more than just “suck in data”
    • Present structured view of everything, even bits that don’t fit a “flat” data model
    • Flatten the parts to insert into Spotfire
Architecture Considerations
Choose the right tool
Appropriate selection of implementation domain reduces complexity.
• Spotfire – User-facing interaction
• JavaScript – Interaction with Spotfire data set, user inputs
• discoveryHub: data integration and transformation
  – Integrate and create user views and “flat” views for Spotfire.
Tool Development Process: Components of a discoveryHub Tool
[Diagram: Client PC (Spotfire Decision Site Browser, Dialog Tool) and discoveryHub server (discoveryHub script, sSQL)]
Dialog Tool Components
• Dialog runs in the DS Client Browser.
  – DHTML/JavaScript
• Parts
  – Dialog front end (user input)
  – Extract from SF dataset
  – Execute dH integration operation
  – Handle Results
    • Display results
    • Add/Modify to SF dataset
discoveryHub Server-side
• Create the integrating query
  – Do the “hard part” of access to the world
  – Transform and flatten as required
    • Pull into a view that is easy to deal with
    • Can create multiple views
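To make the idea of flattening concrete, here is a minimal Java sketch of turning a nested record into a single flat row whose column names are dotted paths. It is only an analogue of what a server-side view does; the class FlattenSketch, the dotted naming scheme, and the sample values are assumptions made for illustration and are not discoveryHub behavior.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only; not part of discoveryHub.
class FlattenSketch {
    // Recursively copy leaf values into 'out', joining nested keys with '.'.
    static void flatten(String prefix, Map<String, Object> nested, Map<String, Object> out) {
        for (Map.Entry<String, Object> e : nested.entrySet()) {
            String key = prefix.isEmpty() ? e.getKey() : prefix + "." + e.getKey();
            Object value = e.getValue();
            if (value instanceof Map) {
                @SuppressWarnings("unchecked")
                Map<String, Object> child = (Map<String, Object>) value;
                flatten(key, child, out);
            } else {
                out.put(key, value);
            }
        }
    }

    // Convenience wrapper that returns a new flat row.
    static Map<String, Object> flatten(Map<String, Object> nested) {
        Map<String, Object> out = new LinkedHashMap<>();
        flatten("", nested, out);
        return out;
    }

    public static void main(String[] args) {
        // Placeholder values, not real annotation data.
        Map<String, Object> kegg = new LinkedHashMap<>();
        kegg.put("pathway", "p1");
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("accession", "acc1");
        record.put("kegg", kegg);
        System.out.println(flatten(record));
    }
}
```

A flat row of this kind maps directly onto table columns, which is the shape a tabular client such as Spotfire needs.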
Example Tool: Gene Annotation
• Start with NCI gene-drug interaction data
• Keys of ID embedded in “Name” field
• Integrates information from several sources “live”
• Create simplified “flat” view for Spotfire from complex objects
• Creates (or updates) new columns in Spotfire
discoveryHub Server Side Script
! Example: tom2.ssql
! Extract data from NCBI Unigene and LocusLink.
! start with accession, go to unigene to get a locus id.
! go to locus link with that. Collect bits of information along
! the way.
! ---------------------------------------------------
! discoveryHub example B. Rolfe, geneticXchange Inc.
! Sample code provided for education and evaluation.
! Use at your own responsibility.
!
set echo off;
! utils.ssql contains the getArgByName definition.
usessqlscript "utils.ssql";

! Get argument values. The accession list is a quoted
! string as passed in, so we use string-tokenize to break
! on whitespace, resulting in a list of accessions : we cast
! this to a set (l2s) for use in
! the select that follows.
create view astr as getArgByNum(1);
create view acclist as l2s(string-tokenize(" ",astr,0));

! full_view is the desired result; We get a Unigene ID for each accession,
! and pull locus-id by the Unigene IDs and finally go to locus-link.
create view full_view as
select (
    #accession: acc,
    #unigene: ugid.full-id,
    #locusid: llid,
    #chromopos: ll.cytogenetic,
    #kegg: ll.kegg.pathway,
    #locuslink: ll)
from
    acclist as acc,
    webunigene-id-general(acc) as ugid,
    get-locus-from-unigene(#org: ugid.org, #cid: ugid.cid) llid,
    locuslink-by-locusid-2(num-stringify(llid)) as ll;

full_view;
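The script's first step takes one quoted argument, breaks it on whitespace, and casts the resulting list to a set. A rough Java analogue of that step is sketched below; it does not claim to reproduce the exact semantics of string-tokenize or l2s, and the class name AccessionArgs and the sample accession strings are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Illustrative analogue of the tokenize-then-cast-to-set step; not sSQL.
class AccessionArgs {
    // Split a whitespace-separated accession string and drop duplicates.
    // Assumption: first-seen order is kept, which sSQL sets may not do.
    static Set<String> toAccessionSet(String quotedArg) {
        Set<String> out = new LinkedHashSet<>();
        String trimmed = quotedArg.trim();
        if (trimmed.isEmpty()) {
            return out;
        }
        out.addAll(Arrays.asList(trimmed.split("\\s+")));
        return out;
    }

    public static void main(String[] args) {
        // Placeholder accession values, used only for illustration.
        System.out.println(toAccessionSet("AA001 AA002  AA001"));
    }
}
```

Working from a set means each accession is looked up once, even if the caller passes the same value several times.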
Example Tool: Transcription Analysis
• Uses annotated data (Locus Link ID)
• Retrieves set of transcription factors by gene
• Retrieves set of genes by transcription factor
• Integrates multiple web queries to oPossum: http://sonoma.cmmt.ubc.ca/cgi-bin/POSSUM/possum/
• Transforms and presents simplified view
discoveryHub Server Side Script
! Gene search of oPossum for transcription factor analysis.
! Used by spotfire tool TZ2-ORTFA
usessqlscript "utils.ssql";
usessqlscript "fisher2a.ssql";
usessqlscript "tf.ssql";

! Input to the script - search parameters
create view species as getArgByNum(1);
create view idtype as getArgByNum(2);
create view phylum as getArgByNum(3);
create view idstr as getArgByNum(4);

! oPossum transcription factor search using gene IDs and search params input
create view a as fisher2a(#species:species, #idtype:idtype, #phylum:phylum, #ids:idstr);

! Extract gene lists by transcription factor for a Transcription factor
create view v1 as select set-head(tf(x.TargetGeneHitsURL)) from a x;

! Function we use to simplify select statement
create function f1 ss as select z.GeneID from ss z;

! Final view of Results of Search: genes associated with each TF
select (#TF: x.#TranscriptionFactor, #gids: f1(x.Genes)) from v1 x;
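The tool works in both directions: transcription factors by gene, and genes by transcription factor. The Java sketch below shows how one mapping can be inverted into the other on the client side. It is an illustration only; the class TfIndexSketch and the placeholder identifiers are assumptions, and the actual tool obtains these views from oPossum through discoveryHub.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustrative sketch only; not how the sSQL script computes its view.
class TfIndexSketch {
    // Invert gene -> transcription factors into transcription factor -> genes.
    static Map<String, Set<String>> genesByFactor(Map<String, Set<String>> factorsByGene) {
        Map<String, Set<String>> out = new TreeMap<>();
        for (Map.Entry<String, Set<String>> e : factorsByGene.entrySet()) {
            for (String tf : e.getValue()) {
                out.computeIfAbsent(tf, k -> new TreeSet<>()).add(e.getKey());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Placeholder gene and factor names, not real data.
        Map<String, Set<String>> factorsByGene = new TreeMap<>();
        factorsByGene.put("geneA", new TreeSet<>(Set.of("tf1", "tf2")));
        factorsByGene.put("geneB", new TreeSet<>(Set.of("tf1")));
        System.out.println(genesByFactor(factorsByGene));
    }
}
```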
Example Tool: Column Sorting
• Purpose: Enhance Heat Map view
  – Automates a tedious point-and-click process
  – Accelerates finding focus on “interesting” data
• Rearranges Heat Map Columns
  – Filters by column sparseness
  – Sorts the order in which columns appear
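The two operations of this tool, filtering by sparseness and sorting columns, can be sketched in a few lines of Java. The slides do not define sparseness, so this sketch assumes it means the fraction of missing (null) cells in a column; the class ColumnSortSketch, the threshold parameter, and sorting by full column name are likewise assumptions, not the tool's actual implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; not the Spotfire tool's code.
class ColumnSortSketch {
    // Assumed definition: fraction of null cells; an empty column is fully sparse.
    static double sparseness(List<Double> column) {
        if (column.isEmpty()) {
            return 1.0;
        }
        long missing = column.stream().filter(v -> v == null).count();
        return (double) missing / column.size();
    }

    // Keep columns whose sparseness is at most maxSparseness, ordered by name.
    static List<String> selectAndSort(Map<String, List<Double>> columns, double maxSparseness) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, List<Double>> e : columns.entrySet()) {
            if (sparseness(e.getValue()) <= maxSparseness) {
                kept.add(e.getKey());
            }
        }
        kept.sort(Comparator.naturalOrder());
        return kept;
    }
}
```

Sorting on part of a column name would only require swapping the comparator for one that extracts the chosen part before comparing.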
10/27/2004 discoveryHub and Spotfire 41
Then the column order is changed, sorted by column name or a part of it as selected.
He now runs an external, proprietary application to cluster via a tool
He used the query device to see compounds clustered around cluster 6 with 0.5 score
…the next tool does another external process via discoveryHub
• Start with NCI drug-gene interaction data
• Annotate data from multiple sources
  – GenBank, LocusLink, UniGene, etc. from NCBI
  – KEGG pathway
• Transcription Analysis
  – Uses oPossum live (multiple pages)
• More details
• Idea of sequential workflow via tools
  – Automate the tedious parts
  – Insert domain expertise at the right points
New columns: “accession” and “locusid”, which we use in subsequent tools.
Locus-link tool: get the complete locus link record and display it.
• Pulled in multiple external sources automatically
• Insert domain expertise at the right points
• Reduce “cut and paste” time exponentially
• Enable workflows too tedious to do manually
• Open up the world from inside Decision Site