Cyber-Infrastructure in Education
South Carolina State University Cyberinfrastructure Day
March 3 2011
Geoffrey Fox
http://www.infomall.org http://www.futuregrid.org
Director, Digital Science Center, Pervasive Technology Institute
Associate Dean for Research and Graduate Studies, School of Informatics and Computing
Indiana University Bloomington
Types of Activities
• Cyberinfrastructure ranges from a web page through a petaflop supercomputer
• Research has substantial needs such as either
– Petaflop supercomputer
– Ability to analyze many (up to 100 now) terabytes of data (on a cloud)
• Education needs
– Access to results of cyberinfrastructure research
– Broad access to scholarly information (digital library)
– Teach students about e-Science (domain science) and Cyberinfrastructure (Computer Science)
– Exploit electronic infrastructure to enhance learning
Access to results of cyberinfrastructure research
• Portals are the access points to electronic resources
– For example Amazon.com is an access point to an electronic shop
• e-Science projects have a portal interface for their scientists
– Some have education components
– Some interest in producing education oriented interfaces by outsiders but no clear initiative?
Broad access to scholarly information (digital library)
• National Science Digital Library http://nsdl.org/
• There is an interesting discussion of the role of University libraries in preservation of “data produced by faculty”
• Curriculum libraries such as that at MIT or HPCUniversity
• Collections of articles maintained by publishers and professional societies have problems due to charges
– Role of centralized and de-centralized collections still not agreed
• Google (for example) is keen to “own all data” including digital books and even science data if it can be linked to Google Earth!
– But this has the opposite problem of preserving Intellectual property (seen clearly in music piracy)
• Note MapReduce perfect for analyzing such data
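As a rough illustration of the MapReduce remark, the sketch below counts words across a small collection of documents using explicit map, shuffle and reduce steps in plain Python. It is not tied to Hadoop or any other framework; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample documents are assumptions made for this example only.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce over an iterable of document strings."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample documents standing in for a digital library.
    docs = ["digital library data", "science data in a digital library"]
    print(word_count(docs))
```

Each document is mapped independently, so in a real deployment the map calls could run on different machines; only the shuffle needs to see all emitted pairs.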
Teach students about e-Science (domain science) and Cyberinfrastructure (Computer Science)
• This can be quite sophisticated as in difficult parallel algorithms
• As in portals, one can leverage research investments
• Does not need students to run petaflop simulations
– Should be able to capture essence of computational/science issues in smaller runs
– Appliances (see later) can be used
• FutureGrid possible site
• Note clouds very popular with students as there are many commercial jobs at companies that develop and use clouds
– As well as for CS research and as vehicle for domain science
Exploit electronic infrastructure to enhance learning
• Several quite old approaches are critical and dominant
– “Just a bunch of web pages” aka digital library
– Video conferencing
– Shared material as in Webex, Adobe Connect
• Note asynchronous interaction via Twitter, Blackboard, Google docs etc. much easier (and more successful) than synchronous (Polycom, access grid, Webex) approaches
• Interactive web learning environments such as www.whyville.net
• Virtual worlds such as Second Life have not taken off but some think this will change as performance of clients and networks is improving dramatically (VRML failed ~1999)
• Must move to an environment consistent with world view of current students aka the “Twitter University”
C4 = Continuous Collaborative Computational Cloud
C4 EMERGING VISION
While the internet has changed the way we communicate and get entertainment, we need to empower the next generation of engineers and scientists with technology that enables interdisciplinary collaboration for lifelong learning.
Today, the cloud is a set of services that people explicitly have to access (from laptops, desktops, etc). In 2020 the C4 will be part of our lives, as a larger, pervasive, continuous experience. The measure of success will be how “invisible” it becomes.
C4 Education Vision
C4 Education will exploit advanced means of communication, for example, “Tabatar” conference tables as clients, with real-time language translation, contextual awareness of speakers, support for people with disabilities; servers supporting collaboration between learners and teachers through “virtual worlds” generalizing Twitter Clouds with MapReduce frontends, Second Life ……
C4 Society Vision
We are no prophets and can’t anticipate what exactly will work, but we expect to have high bandwidth and ubiquitous connectivity for everyone everywhere, even in rural areas (using power-efficient micro data centers the size of shoe boxes). Here the cloud will enable business, fun, destruction and creation of regimes (societies).
Higher Education 2020
[Figure: C4 (Continuous Collaborative Computational Cloud) Intelligence diagram. Labels: Computational Thinking; Modeling & Simulation; Internet & Cyberinfrastructure; C(DE)SE; C4 Intelligent Economy; C4 Intelligent People; C4 Intelligent Society]
• Motivating Issues
– Job / education mismatch
– Higher Ed rigidity
– Interdisciplinary work
– Engineering v. Science, Little v. Big science
• NSF: Educate “Net Generation”, re-educate pre “Net Generation” in Science and Engineering, exploiting and developing C4
– C4 Curricula, programs
– C4 Experiences (delivery mechanism)
– C4 REUs, Internships, Fellowships
CDESE is Computational and Data-enabled Science and Engineering
Educational appliances
• One component of C4
• A flexible, extensible platform for hands-on, lab-oriented education (on FutureGrid)
• Need to support appliances representing clusters of resources
• Virtual machines + social/virtual networking to create sandboxed modules
– Virtual “Grid” appliances: self-contained, pre-packaged execution environments
– Group VPNs: simple management of virtual clusters by students and educators
Why use Virtualization?
• Traditional ways of delivering hands-on training and education in parallel/distributed computing have non-trivial dependences on the environment
• Difficult to replicate same environment on different resources (e.g. HPC clusters, desktops)
• Difficult to cope with changes in the environment (e.g. software upgrades)
• Virtualization technologies remove key software dependences through a layer of indirection
Appliance Infrastructure - guiding principles
• Fidelity: activities should use full-fledged, executable software: education/training modules
– Learn using the proper tools
• Reproducibility: Creators of content should be able to install, configure, and test their modules once, and be assured of the same functional behavior regardless of where the module is deployed
– Incentive to invest effort in developing, testing and documenting new modules
Appliance Infrastructure - guiding principles
• Deployability: Students and users should be able to deploy modules in a simple manner, and in a variety of resources
– Reduce barriers to entry; avoid dependences upon a particular infrastructure
• Community-oriented: Modules should be simple to share, discover, reuse, and expand
– Create conditions for “viral” growth
Towards this vision in FutureGrid
• Executable modules – virtual appliances
– Deployable on FutureGrid resources
– Deployable on other cloud platforms, as well as virtualized desktops
• Community sharing – Web 2.0 portal, appliance image repositories
– An aggregation hub for executable modules and documentation
Virtual appliance example
• Linux, Java, Hadoop, configuration scripts
[Figure: a Hadoop image is copied and instantiated on the virtualization layer to create a Hadoop worker; repeating the copy and instantiate steps creates another Hadoop worker]
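The copy, instantiate and repeat steps of this example can be modelled in a few lines of Python. This is only a toy sketch of the idea, not FutureGrid code or any real virtualization API: `ApplianceImage`, `Worker` and `VirtualizationLayer` are hypothetical classes invented here, and "instantiating" simply means giving each worker its own private copy of the image.

```python
import copy
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ApplianceImage:
    """Hypothetical stand-in for a pre-packaged appliance image."""
    name: str
    software: list = field(default_factory=list)


@dataclass
class Worker:
    """Hypothetical running instance created from a copy of an image."""
    worker_id: int
    image: ApplianceImage


class VirtualizationLayer:
    """Toy model of the layer that copies an image and instantiates workers."""

    def __init__(self):
        self._ids = count(1)
        self.workers = []

    def instantiate(self, image):
        # Copy step: each worker gets a private copy, so changes made
        # inside one worker never touch the master image.
        private_copy = copy.deepcopy(image)
        # Instantiate step: register a new worker backed by that copy.
        worker = Worker(next(self._ids), private_copy)
        self.workers.append(worker)
        return worker


if __name__ == "__main__":
    hadoop_image = ApplianceImage(
        "hadoop", ["Linux", "Java", "Hadoop", "configuration scripts"]
    )
    layer = VirtualizationLayer()
    # Repeat: the same image yields as many identical workers as needed.
    for _ in range(3):
        worker = layer.instantiate(hadoop_image)
        print(worker.worker_id, worker.image.software)
```

Because every worker starts from the same image, a module author only has to configure and test the image once, which is the reproducibility principle stated earlier.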
Virtual cluster appliances
• Virtual appliance + virtual network
[Figure: a Hadoop + virtual network appliance is copied and instantiated repeatedly; each resulting virtual machine is a Hadoop worker, and the workers are connected by a virtual network]
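To make "virtual appliance + virtual network" concrete, the following sketch models workers joining a shared private network and discovering one another. It is an illustration under stated assumptions rather than how any real group VPN is implemented: the `GroupVirtualNetwork` class, the `start_virtual_cluster` helper, the `10.10.0.0/24` address range and the worker naming scheme are all hypothetical. Only the standard-library `ipaddress` module is used.

```python
import ipaddress


class GroupVirtualNetwork:
    """Hypothetical model of a group virtual network handing out private addresses."""

    def __init__(self, group, cidr="10.10.0.0/24"):
        self.group = group
        # Iterator over the usable host addresses in the (assumed) private range.
        self._free = iter(ipaddress.ip_network(cidr).hosts())
        self.members = {}

    def join(self, worker_name):
        """Assign the next free address to a worker and record its membership."""
        address = next(self._free)
        self.members[worker_name] = address
        return address

    def peers(self, worker_name):
        """Return every other member, which is what a worker needs to form a cluster."""
        return {name: addr for name, addr in self.members.items() if name != worker_name}


def start_virtual_cluster(image_name, group, size):
    """Instantiate `size` workers from one image name and join them to one network."""
    network = GroupVirtualNetwork(group)
    for index in range(1, size + 1):
        network.join(f"{image_name}-worker-{index}")
    return network


if __name__ == "__main__":
    cluster = start_virtual_cluster("hadoop", "class-demo", 3)
    for name, address in cluster.members.items():
        print(name, address, "peers:", sorted(cluster.peers(name)))
```

The point of the design is that membership in the network, not the physical location of a virtual machine, determines which workers belong to a cluster.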