Talk at Virginia Tech, Nov 14, 2008
Deepak Singh, Ph.D.
Picture via Eole under a CC-BY-NC-SA license
?
Via Reavel under a CC-BY-NC-ND license
biz dev manager
resizable compute capacity
scalable web sites
number crunching
but
life science industry
software
informatics
scientific programmer
product manager
strategist
opinions
lots of opinions
career choices
software development
informatics
computing
data
open data
http://mndoci.com
http://c2cbio.com or on iTunes
http://bioscreencast.com
By jasarcadia under a CC-BY-NC-ND license
A meme (pronounced /miːm/) consists of any idea or behavior that can pass from one person to another by learning or imitation
big data
collective intelligence
the new science
By ~Prescott under a CC-BY-NC license
datasets
many datasets
PFAM
GENBANK ENSEMBL
PDB
Many Others
manageable
download
data management is not
data storage
smart
context
Via Nature Reviews Cancer
technology
technology?
?
?
?
technology
technology
technology
technology
Back of the room
listening
toxicologists
experiment design
holistic
systems biology
the s*&t hits the fan
Image courtesy Matt Wood
genome #1
$3 billion
15 years
1000 genomes
http://www.1000genomes.org/
By bitterlysweet under a CC-BY-NC-ND license
75 TB / week
600 GB – 6 TB / run
200 TB drive
schema
too big to fit on a wall
implications
Via Barack Obama under a CC-BY-NC-SA license
utilization
capacity planning
data availability
data access
collaboration
computation
typical informatics workflow
distribute everything
distributed data
distributed computing
Via bionicteaching under a CC-BY license
services everywhere
data services
application services
api
available everywhere
available all the time
sensors
adverse event reporting
research streaming
computing everywhere
Via Laughing Squid under a CC-BY-NC-ND license
collective intelligence is a shared or group intelligence that emerges from the collaboration and competition of many individuals.
networked future of science
protective
A biologist would rather share their toothbrush than their (gene) names
-- Mike Ashburner (Cambridge)
wisdom
look elsewhere
Wherever you work most of the smart people are somewhere else
-- Bill Joy
TIMTOWTDI
data
data finds the data,
then people find people
Source: Jon Udell
important
world wide web
giant global graph
search
traverse link graph
people
present
future
data in context
linked data
the artist formerly known as
the semantic web
entity extraction
follow the graph
let the data find the data
and then
people will find the people
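A minimal sketch of "follow the graph", assuming a tiny link graph held in a dictionary. The identifiers (`gene:A`, `person:alice`, and so on) and the helper names `follow_graph` and `find_people` are hypothetical, made up for illustration; real linked data would use URIs and a proper store. The traversal is a plain breadth-first search: data leads to more data, and the people attached to it fall out at the end.

```python
from collections import deque

# Hypothetical link graph: each key links to the identifiers in its list.
LINKS = {
    "gene:A": ["protein:A1", "paper:1"],
    "protein:A1": ["structure:X"],
    "paper:1": ["person:alice"],
    "structure:X": ["person:bob"],
    "person:alice": [],
    "person:bob": [],
}

def follow_graph(links, start):
    """Breadth-first traversal; returns nodes in the order first reached."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in links.get(node, []):
            if nxt not in seen:  # visit each linked item only once
                seen.add(nxt)
                queue.append(nxt)
    return order

def find_people(links, start):
    """Keep only the people reachable from a starting piece of data."""
    return [n for n in follow_graph(links, start) if n.startswith("person:")]
```

Calling `find_people(LINKS, "gene:A")` walks from one gene record through its linked protein, paper, and structure to the people connected to them.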
information overload
filter failure
human trust networks
many ways
scientific social networks
why?
put people first
communities around data
http://ecolicommunity.org
http://ebird.org/content/ebird/
micro-communities
little segue
“Bursty Work”
loosely distributed collaborations
computational problems
back on track
I define Web 2.0 as the design of systems that harness network effects to get better the more people use them, or more colloquially, as “harnessing collective intelligence.” This includes explicit network-enabled collaboration, to be sure, but it should encompass every way that people connected to a network create synergistic effects
-- Tim O’Reilly
web as platform
data driven platform
people driven platform
bayesian filter
find relevant information
huge amounts of data
architect for innovation
google visualization api
structured data
multiple sources
connected to the web
platform
create
share
re-use
visualizations
create
share
re-use
create
share
re-use
mashups
collect
analyze
remix
repurpose
only way
open data
obey web standards
xml
json
rdf
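A small sketch of what "obey web standards" can look like in practice, assuming a flat record with string fields. The record contents and the helper names `to_json`, `to_xml`, and `from_xml` are hypothetical; only the Python standard library is used. RDF is left out here because it would need a third-party library.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record; field names and values are illustrative only.
record = {"id": "example-001", "source": "PDB", "kind": "structure"}

def to_json(rec):
    """Serialize the record as a JSON object with stable key order."""
    return json.dumps(rec, sort_keys=True)

def to_xml(rec):
    """Serialize the record as a flat XML element, one child per field."""
    root = ET.Element("record")
    for key in sorted(rec):
        ET.SubElement(root, key).text = str(rec[key])
    return ET.tostring(root, encoding="unicode")

def from_xml(text):
    """Parse the flat XML form back into a dictionary of strings."""
    root = ET.fromstring(text)
    return {child.tag: child.text for child in root}
```

The same record round-trips through either format, which is the point: anyone on the web can read it back without knowing how it was produced.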
all this stuff
new models
research
collaboration
business
exciting times
Via The Opportunity Agenda under a CC-BY-NC-SA license
the door is open
take the step
Acknowledgements
Matt Wood
Carole Goble
Larry Lessig
The Biogang