Computing Research and Education at the University of Pittsburgh
Physics, Econ., Math., Bio., Chem., Social sciences, Pub. Health, Eng., Medical school
The Pittsburgh Supercomputing Center
School of Information Science
Department of Computer Science
(Arts & Sciences)
Department of Electrical and Computer Engineering
(School of Engineering)
Department of Biomedical Informatics
Department of Computational Biology
The School of Business
Telecommunication Program (G)
Inter-disciplinary programs
Computer Engineering Program (G+UG)
Computational Biology Program (G)
Bioinformatics Program (UG)
Department of Computer Science
Scientific Computing Program (UG)
Math.
SIS
Biology
Med. School + CMU
ECE
Intelligent Systems Program (G)
DBMI, SIS, Psychology, …
Computational Modeling and Simulation (G)
Department of Computer Science
Data Management
Algorithms
Artificial Intelligence
Networks
Architecture and Compilers
Security
Distributed and Parallel Systems
Graphics
21 T/S faculty (now 19) + 4 Lecturers
Real-Time Systems
Artificial Intelligence
Rebecca Hwa, Milos Hauskrecht, Diane Litman, Janyce Wiebe
Machine learning
Natural language processing
Data mining
Real-world data:
• high dimensional (thousands of variables)
• time series
• imperfect (missing data)
Tasks:
(1) Identify important patterns in data
• Use them to support a variety of prediction and discovery tasks
• Identification of relations/dependencies among variables
(2) Identify unusual patterns in data
– Outlier detection
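As an illustration of the outlier-detection task on imperfect data, here is a minimal sketch that flags unusual values with a modified z-score based on the median absolute deviation and skips missing readings. The function name `mad_outliers`, the threshold of 3.5, and the sample readings are assumptions made for this example; the slides do not say which detection method the group uses.

```python
import math
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds threshold.

    Missing readings (None or NaN) are skipped rather than flagged.
    """
    present = [(i, v) for i, v in enumerate(values)
               if v is not None and not math.isnan(v)]
    if not present:
        return []
    med = median(v for _, v in present)
    mad = median(abs(v - med) for _, v in present)
    if mad == 0:
        # Degenerate case: most values are identical, so flag anything else.
        return [i for i, v in present if v != med]
    # 0.6745 rescales the MAD so the score is comparable to a z-score.
    return [i for i, v in present
            if 0.6745 * abs(v - med) / mad > threshold]


# Hypothetical lab measurements with two missing entries.
readings = [4.1, 3.9, None, 4.0, 4.2, 12.5, 4.1, float("nan"), 3.8]
outliers = mad_outliers(readings)
```

A median-based score is used here because a single extreme value inflates the mean and standard deviation, which can hide the very outlier being sought.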
Analysis of high dimensional datasets
Data mining and machine learning
Biomedical and bioinformatics applications
Data:
• Clinical data: thousands of labs, measurements in time
• Bioinformatics data: DNA and proteomic arrays, mass spectrometry data, SNP (single nucleotide polymorphism) arrays
Tasks:
• Disease detection (e.g. cancer screening)
• Predict therapy response
• Detection of unusual patterns
• Patient monitoring and alerting
• Identification of relations among diseases/variables
[Figure: mass spectrometry profile; axes: m/z and intensity]
Traffic applications
Data:
• Data from sensors placed on highways, roads
• Infrastructure data (maps)
Tasks:
• Probabilistic models of the traffic system
• Traffic prediction
• Route optimization
NLP + ML for Monitoring Acute Lower Respiratory Syndrome
Emergency Department Reports
NLP Modules
• Locate instances of 55 conditions
• Assign values to contextual features
ALRS Classifier
Determine values for 55 conditions
NLP and machine learning
Subjectivity Analysis: opinions, emotions, motivations, speculations, sentiments
• Information Extraction of
– NL expressions
– Components
– Properties
Angolans are terrified of the Marburg virus
[Annotation labels: Source, Attitude, Target]
Negative Emotion, Intensity: High
Opinion Frame
Source: Angolans
Polarity: negative
Attitude: emotion
Intensity: high
Target: Marburg virus
. . .
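The opinion frame above can be pictured as a small record type. The sketch below is only an assumption about how such a frame might be held in code: the class `OpinionFrame`, the helper `count_by_target`, and the choice of plain strings for every field are hypothetical, not the representation used by the research group.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class OpinionFrame:
    """One extracted opinion, with the five fields listed on the slide."""
    source: str
    polarity: str
    attitude: str
    intensity: str
    target: str


def count_by_target(frames, polarity):
    """Count frames with the given polarity, grouped by target."""
    return Counter(f.target for f in frames if f.polarity == polarity)


# The frame from the slide's example sentence.
frame = OpinionFrame(
    source="Angolans",
    polarity="negative",
    attitude="emotion",
    intensity="high",
    target="Marburg virus",
)
summary = count_by_target([frame], "negative")
```

Grouping frames by target is one simple way to move from extraction toward summarization of opinions across many sentences.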
Natural Language processing
Fine-grained Opinions
Australian press has launched a bitter attack on Italy after seeing their beloved Socceroos eliminated on a controversial late penalty. Italian coach Lippi has also been blasted for his comments after the game.
In the opposite camp Lippi is preparing his side for the upcoming game with Ukraine. He hailed 10-man Italy's determination to beat Australia and said the penalty was rightly given.
Opinion Frame
Source: Australian Press
Polarity: negative
Attitude: sentiment
Intensity: high
Target: Italy
. . .
[Figure: opinion diagram with nodes Australian Press, Italy, Marcello Lippi, penalty, Socceroos]
Extraction and Summarization of Opinions
• Provide technology that can aid analysts in
– extracting socio-behavioral information from text
– monitoring public health awareness, knowledge and speculations about disease outbreaks, …
• Enrich Information Extraction, Question Answering, and Visualization tools
Spoken Dialog Systems
• Systems that interact with users via speech
• Provide automated telephone or microphone access to a back-end
• Advantages: naturalness, efficiency, eyes and hands free
[Figure: the user speaks to the Spoken Dialog System (Speech Recognition), which accesses a back-end (DB, web, system) and answers via TTS or recording]
TTS = text-to-speech
Natural Language processing
Intelligent Tutoring Systems
• Education
– Classroom instruction [most frequent form]
– Human (one-on-one) tutoring [most effective form]
• Computer tutors
– Intelligent Tutoring Systems
– Not as good as human tutors
– Ways to address the performance gap
• (Spoken) dialog systems
• Affective (dialog) systems (exploit the user's emotion, i.e. affect)
• Respond to both what a user says and how it is said
• Evaluation and Automatic System Optimization
– How can we tell if we are improving a system?
– Can systems be tested with simulated rather than real users?
– Can a system learn to optimize behavior based on prior data?
Intersection of two fields
• Spoken Dialog Systems
• Intelligent Tutoring Systems
Data Management
Panos Chrysanthis, Alex Labrinidis
Mobile data management
Web and real-time data management
Stream data management
Scientific data management
[Figure: query plan with inputs M1 and M2, queries Q1, Q2 and Q3, operators Ox, Oy, Oz, Ol and Or, an operator segment Ex, and shared operators]
Data Acquisition
Data Stream Processing
Web Data Management
Data Dissemination
Mobile data management (sensor databases)
Energy-efficient in-network aggregation for continuous (monitoring) queries
• Hierarchical output filters that reduce energy consumption while bounding loss in aggregate data quality
• Support for views that maintain in-network Top-k results
• Cross-layer optimization for collision-avoidance
• Multi-criteria routing for sensor networks to prolong lifetime and improve quality of data
Self-adapting data routing to meet user specified QoS and QoD requirements based on machine learning
• Efficient data acquisition with mobile sensors
Data acquisition is scheduled at perimeter sensors and storage at core nodes as spatiotemporal aggregates
Energy-efficient in-network storage and processing of queries in ad-hoc networks
• Load balancing of storage and query hot-spots in Data-Centric Storage schemes
Zone sharing, data replication and dynamic restructuring of the reading-to-sensor mappings
• Similarity-aware query processing in sensor networks with data centric storage
Utilizing recent query materializations in the form of sensor views
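To make the idea of output filters concrete, here is a minimal single-level sketch: each sensor transmits only when its reading has moved more than a filter width away from the last value it sent, and the parent sums the cached values. For a SUM aggregate the error is then bounded by the sum of the widths. The class `FilteredSensor`, the function `simulate`, the width of 0.5 and the readings are all assumptions for illustration; the hierarchical filters named on the slide also distribute widths across a routing tree, which this sketch does not attempt.

```python
class FilteredSensor:
    """A sensor that suppresses reports that stay within its filter width."""

    def __init__(self, width):
        self.width = width
        self.last_sent = None

    def report(self, reading):
        """Return the reading if it must be transmitted, else None."""
        if self.last_sent is None or abs(reading - self.last_sent) > self.width:
            self.last_sent = reading
            return reading
        return None


def simulate(sensors, rounds):
    """Feed rounds of readings to the sensors.

    Returns the parent's approximate SUM and the number of messages sent.
    """
    cached = [0.0] * len(sensors)  # parent's last known value per sensor
    messages = 0
    for readings in rounds:
        for i, (sensor, reading) in enumerate(zip(sensors, readings)):
            sent = sensor.report(reading)
            if sent is not None:
                cached[i] = sent
                messages += 1
    return sum(cached), messages


# Hypothetical temperature readings from three sensors over three rounds.
ROUNDS = [
    [20.0, 21.0, 19.0],
    [20.2, 21.4, 19.1],
    [20.4, 22.0, 19.3],
]
WIDTH = 0.5
approx_sum, messages = simulate([FilteredSensor(WIDTH) for _ in range(3)], ROUNDS)
```

In this run only 4 of the 9 possible transmissions occur, and the approximate sum stays within 3 × 0.5 of the true sum of the final round. Fewer radio transmissions are what saves energy on the sensor nodes.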
Data stream management systems
• Alerting/Monitoring Service
– Register query (filter) ahead of time
– “Match” against incoming data stream
– Generate “events” & notify users
• Examples:
– Stock market monitoring
– Transient alerts (LSST)
– Google alerts
– Detection of outbreak of diseases
• Objective: Policies for scheduling the execution of multiple continuous queries and for load shedding, which improve the freshness and performance of a DSMS (response time, processing rate, fairness)
• Solution characteristics:
– Efficient implementation
– Scheduling join operators
– Exploits shared operators
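The register, match and notify cycle, together with operator sharing, can be sketched as follows. Queries are registered ahead of time as conjunctions of named filter operators, and an operator used by several queries is evaluated only once per incoming item. The class `StreamMonitor`, the operator names and the stock tick are hypothetical. Scheduling and load shedding, which are the actual research objectives on the slide, are not modeled.

```python
class StreamMonitor:
    """Continuous queries built from named filter operators that can be shared."""

    def __init__(self):
        self.operators = {}   # operator name -> predicate over one item
        self.queries = {}     # query id -> operator names (all must hold)
        self.evaluations = 0  # predicate calls, to show the effect of sharing

    def add_operator(self, name, predicate):
        self.operators[name] = predicate

    def register(self, query_id, operator_names):
        self.queries[query_id] = list(operator_names)

    def process(self, item):
        """Match one incoming item against every registered query."""
        cache = {}   # per-item results, so a shared operator runs once
        events = []
        for query_id, names in self.queries.items():
            matched = True
            for name in names:
                if name not in cache:
                    cache[name] = self.operators[name](item)
                    self.evaluations += 1
                if not cache[name]:
                    matched = False
                    break
            if matched:
                events.append((query_id, item))
        return events


monitor = StreamMonitor()
monitor.add_operator("is_acme", lambda t: t["symbol"] == "ACME")
monitor.add_operator("big_drop", lambda t: t["change"] <= -5.0)
monitor.add_operator("high_volume", lambda t: t["volume"] > 1_000_000)
monitor.register("Q1", ["is_acme", "big_drop"])
monitor.register("Q2", ["is_acme", "high_volume"])

events = monitor.process({"symbol": "ACME", "change": -6.0, "volume": 500})
matched_ids = [query_id for query_id, _ in events]
```

Here `is_acme` is shared by Q1 and Q2, so the tick costs three predicate evaluations instead of four.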
[Figure: query plan with inputs M1 and M2, queries Q1, Q2 and Q3, operators Ox, Oy, Oz, Ol and Or, an operator segment Ex, and shared operators]
User centric web data management
Given an option, would you prefer slightly stale results fast OR fresh results, slightly delayed?
• Users care about Timeliness and Staleness
• Combining performance metrics
– Set a constraint on one and optimize the other
– Construct a single metric based on weights
• Scheduling policies (FIFO, update high, query high) do not optimize both metrics
• Proposed Contracts Framework converts performance on each individual metric into “worth” to users, combining
– quality of service and
– quality of data
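One way to picture converting two metrics into a single "worth" is sketched below. Each metric gets a hypothetical contract whose worth decays linearly from a maximum at zero to nothing at a limit, and the two worths are added. The linear shape, the limits (1000 ms and 0.5 unapplied updates) and the function names are assumptions for this example, not the framework's actual contract definitions. The three measurements are the (response time, staleness) pairs read from the labels of the slide's plot.

```python
def worth(value, limit, max_worth=1.0):
    """Linearly decaying worth: max_worth at 0, nothing at or beyond limit."""
    return max_worth * max(0.0, 1.0 - value / limit)


def total_worth(response_ms, staleness, qos_limit_ms=1000.0, qod_limit=0.5):
    """Combine quality of service (response time) and quality of data (staleness)."""
    return worth(response_ms, qos_limit_ms) + worth(staleness, qod_limit)


# (response time in ms, staleness in #uu) per scheduling policy.
MEASURED = {
    "FIFO-UH": (11591, 0.0),
    "FIFO": (322, 0.07),
    "FIFO-QH": (23, 0.26),
}

scores = {policy: total_worth(rt, st) for policy, (rt, st) in MEASURED.items()}
best = max(scores, key=scores.get)
```

The ranking depends entirely on the assumed limits: tightening the response-time limit favors the query-high policy, while tightening the staleness limit favors update-high.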
[Figure: response time (ms) vs. staleness (#uu); points labeled FIFO-UH [11591, 0], FIFO [322, 0.07], FIFO-QH [23, 0.26]]
Biological data management
Center for Modeling Pulmonary Immunity
• NIH-funded (2005 - 2009)
• Builds mathematical models and a data exchange server to
– Record all experimental information
– Enable sharing & interoperability across centers
– Present a User-Centric View-Based Annotation Framework for Data
– Allow continuous Scientific Workflows
Modeling, Visualization and Computer Graphics
Elisabeta Marai
Visual mining
Exploratory visualization
Physically-based modeling
Interactive tools
Image Processing, Modeling & Simulation of Biological Structures
w/ UPMC Orthopedics
• Medical measurements (images, motion, forces, etc.)
• Incomplete data (half of parameters not measurable)
• Uncertainty associated w/ input data
• Multiple sources of data (e.g., literature reports)
Predictive models and simulations
Exploratory Visualization and Analysis
Biomedical anomaly detection (exploratory visualization)
Collaborative analysis of defective machine translations
Computer Networks
Adam Lee, Rami Melhem, Daniel Mosse, Taieb Znati
Network Protocols
Wireless Networks
Security
Power Management
Determination of access rights
Ex: Disaster response, supply chain management, p2p, grid computing, the Web…
Trust management systems seek to address this problem
– Declarative access control policies
– Cryptographic credentials
– Runtime proof construction techniques to make authorization decisions
Current research:
– Flexible access control and usable policy management
– Decentralized knowledge management in adversarial environments
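As a rough illustration of runtime proof construction, the sketch below searches a set of credentials for a chain that links a trusted root to the requester's role. It assumes a toy policy language in which a credential with role `issuer:<role>` delegates the right to issue `<role>`. The names `Credential` and `prove`, the principals, and the delegation syntax are all hypothetical, and signature verification is deliberately omitted (every credential is assumed to have been checked already).

```python
from typing import NamedTuple


class Credential(NamedTuple):
    """'issuer says subject holds role'; assumed already signature-checked."""
    issuer: str
    subject: str
    role: str


def prove(subject, role, root, credentials):
    """Return a credential chain from root to subject for role, or None."""

    def authority(issuer, seen):
        # An issuer is authoritative if it is the root, or if an
        # authoritative principal delegated issuing rights to it.
        if issuer == root:
            return []
        for cred in credentials:
            if (cred.subject == issuer and cred.role == "issuer:" + role
                    and cred.issuer not in seen):
                chain = authority(cred.issuer, seen | {cred.issuer})
                if chain is not None:
                    return chain + [cred]
        return None

    for cred in credentials:
        if cred.subject == subject and cred.role == role:
            chain = authority(cred.issuer, frozenset({cred.issuer}))
            if chain is not None:
                return chain + [cred]
    return None


# Hypothetical disaster-response scenario.
CREDS = [
    Credential("agency", "hospital", "issuer:responder"),
    Credential("hospital", "alice", "responder"),
    Credential("mallory", "bob", "responder"),
]
alice_proof = prove("alice", "responder", "agency", CREDS)
bob_proof = prove("bob", "responder", "agency", CREDS)
```

Alice's proof is the two-credential chain from the agency through the hospital. Bob's credential comes from an issuer nobody vouches for, so no proof exists and access would be denied.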
Security and Trust: Adam Lee
Flexible proof-based authorization
Project 1: Approximate proofs and risk-based analysis
Project 2: Subjective metrics
[Figure: authorization outcomes labeled NO! and OK!]
Management of distributed knowledge in adversarial settings?
Common knowledge? Distributed knowledge?
Application requirements
Proof system features
What is provable?
Protocols, algorithms, and cryptography
Theory
Practice
Applications: information flow in hospitals, sensor as logical db, pervasive, social networks and gossip, etc…
Collaboration with GSPIA
Secure Critical Information Infrastructure
• System does data gathering, and provides suggestions to Emergency Managers
• System does NOT act by itself, unless there is no one at helm
• A system that provides a lot of information to the Emergency Managers, who actually coordinate emergency responses
• Needs to be secure; otherwise it cannot be used widely
• It is critical since, once in place, it will be depended upon
• EMs, utility companies, everyone must collaborate.
Computer Architecture and Compilers
Donald Chiarulli, Bruce Childers, Sangyeun Cho, Rami Melhem, Youtao Zhang
Chip design
Memory and Cache Systems
Interconnections
Power management
Low Power Terabyte Main Memory using Phase Change Memory
• Problem: Increased main memory demand
– New apps w/ CMPs need terabyte+
– DRAM: High power consumption, problematic organization, reliability (SEUs)
• Solution: Replace DRAM with PCM
– No idle power (solid state), eases organization, no SEUs
– Slower (3x), asymmetric read/write latency, wears out quickly
– Performance management, write minimization, wear leveling
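Two of the mitigations named above, write minimization and wear leveling, can be sketched in a few lines. In this toy model a write touches only the words whose contents actually change, and the allocator always hands out the least-worn free page. The class `PCMPagePool`, its page and word sizes, and the policy details are assumptions for illustration; the slide does not describe the project's actual mechanisms.

```python
import heapq


class PCMPagePool:
    """Toy model of PCM pages with per-page wear counters."""

    def __init__(self, num_pages, words_per_page=4):
        self.data = [[0] * words_per_page for _ in range(num_pages)]
        self.wear = [0] * num_pages  # word writes absorbed by each page
        # Free pages ordered by wear, so the least-worn page comes out first.
        self.free = [(0, page) for page in range(num_pages)]
        heapq.heapify(self.free)

    def allocate(self):
        """Wear leveling: hand out the least-worn free page."""
        _, page = heapq.heappop(self.free)
        return page

    def release(self, page):
        """Return a page to the free pool with its current wear."""
        heapq.heappush(self.free, (self.wear[page], page))

    def write(self, page, words):
        """Write minimization: store only words that differ from the old contents.

        Expects one value per word in the page. Returns the number of words
        actually written.
        """
        written = 0
        for i, word in enumerate(words):
            if self.data[page][i] != word:
                self.data[page][i] = word
                written += 1
        self.wear[page] += written
        return written


pool = PCMPagePool(num_pages=2)
first = pool.allocate()
full_write = pool.write(first, [1, 2, 3, 4])   # all four words change
small_write = pool.write(first, [1, 2, 3, 5])  # only the last word changes
pool.release(first)
second = pool.allocate()                       # the untouched page is preferred
```

Comparing before writing trades an extra read for fewer cell writes, which helps because PCM reads are cheaper than writes and writes are what wear the cells out.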
[Figure: memory system organization]
Processor caches
Memory Manager: Page Allocation, Write Minimization, Wear-leveling, Usage Monitor, Failure Detection
DRAM Controller: Acceleration & Endurance Buffer (AEB), implemented as DRAM
PCM Controller: Terabyte Main Memory (TMM), implemented as PCM
Transient Bookkeeping Data
Persistent Bookkeeping Data
Allocated Pages
Innovative Memory Technology
Yield & Reliability Enhancement for On-Chip Multicore Memories in Nano-scale Technology
• Problem: Increase in defects & process variation for CMPs
– Worst-case design: infeasible due to tighter margins
– On-processor memory components: highly susceptible
• Solution: “Soft yield” trades performance for better chip yield
– Test & plan for repairs during manufacturing
– Deployment adapts microarchitecture to gracefully degrade
– Collaborator: Sangyeun
T-CAR: Test and Continuous Adaptive Repair
[Figure: T-CAR flow in two phases, PRIOR TO CHIP DEPLOYMENT and DEPLOYED. Box labels: Unrepaired Chip; Test, Repair & Binning; Profile-driven testing; Repair Planning; Fault map <Fmax>; Decrease Fmax; Profile new <V,F,T>; Failed: Discard chip; Resource, Workload Models; Incorporate Repair Plans; Monitor & Repair; Fault map <V,F,T> with Condition and Repairs (repeated three times); Repaired Chip]
On-chip Memory
Robust Execution Environment for Multicore Systems
• Problem: Increasing on-chip run-time variability in CMPs
• Solution: React and adapt to the variability as it happens
– Thermal hotspots, power consumption, wear-out/failure
– Dynamic thread compilation, specialization, & scheduling
– Collaborators: Kandemir & Irwin (PSU), Davidson & Soffa (UVA)
[Figure: two plots with axes Number of Threads and Number of Cores. First plot: configurations (16,16), (14,14), (16,14), (16,9) and (11,11), annotated "Two PEs go down", "Thread Migration", "Re-threading + Thread Migration" and "Thread Migration + Re-threading + Voltage Scaling", with 20%, 52% and 56% reductions. Second plot: configurations (16,16), (13,13), (16,10), (11,11), (11,9), (10,10), (10,13), (8,8) and (8,14).]
Graphs courtesy of Mahmut Kandemir
[Panels: Thermal map of Cell; Possible adaptations; Continuous adaptation]
Managing Multicores
3D Lab-on-Chip for Separation, Purification, and Assay of Nanoscale Bio-Particles in Mixtures
Donald Chiarulli, Computer Science; Steve Levitan, ECE; Fred Homa, School of Medicine
Optical Detection and Assay Technology
This is the only device capable of non-destructive detection and assay of bio-particles smaller than the diffraction limit of visible microscopy.
Overview
We exploit the fact that the polysilicon layer is very close to the top surface of the device to create very large and dense electrode arrays for Multiple Frequency Dielectrophoresis (MFDEP). This is only possible by using the upside-down configuration of the tier 1 chip in the MIT-LL 3D process.
MFDEP is a new technique that allows selective manipulation of specific biological particles in mixtures. Each particle experiences a different force magnitude and/or direction based on the field frequency in the region and the electrical properties of the particle.
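The frequency dependence described above can be illustrated with the standard textbook model for a homogeneous sphere, where the sign of the real part of the Clausius-Mossotti factor decides whether a particle is pulled toward high-field regions (positive) or pushed away (negative). This is a generic dielectrophoresis sketch, not the project's own model, and the particle and medium properties below are made-up values chosen only to show a sign change between low and high frequency.

```python
import math

EPS0 = 8.854e-12  # vacuum permittivity in F/m


def complex_permittivity(eps_r, sigma, freq_hz):
    """eps* = eps_r * eps0 - j * sigma / omega."""
    omega = 2.0 * math.pi * freq_hz
    return complex(eps_r * EPS0, -sigma / omega)


def cm_factor(particle, medium, freq_hz):
    """Real part of the Clausius-Mossotti factor for a homogeneous sphere.

    particle and medium are (relative permittivity, conductivity in S/m).
    """
    ep = complex_permittivity(particle[0], particle[1], freq_hz)
    em = complex_permittivity(medium[0], medium[1], freq_hz)
    return ((ep - em) / (ep + 2.0 * em)).real


# Hypothetical values: a particle more conductive but less polarizable
# than the surrounding water-like medium.
PARTICLE = (2.5, 1e-2)
MEDIUM = (78.0, 1e-3)

low = cm_factor(PARTICLE, MEDIUM, 1e3)    # conductivity dominates
high = cm_factor(PARTICLE, MEDIUM, 1e9)   # permittivity dominates
```

With these values the factor is positive at 1 kHz and negative at 1 GHz, so the same particle is attracted at one frequency and repelled at the other. Two particle types with different electrical properties change sign at different frequencies, which is what lets a multi-frequency field separate them.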
In 3D integration technology we can build electrode arrays in the polysilicon of tier one, just 600nm from the MFDEP chamber. The electrodes are 10-20X smaller and denser, and the arrays 100-1000X larger, than in the closest comparable lab-on-chip technology.
Outcome: Much higher sensitivity, large mixture fractionation, and low power, in a tightly integrated implementation.
A revolutionary design that supports direct separation, isolation and population measurements of specific cell types, viruses and biological macromolecules
System Architecture
Tier 1:
• Dense electrode array in poly
• Micro-fluidic trench in overglass
Tier 2:
• Analog switch array
• Individually addresses each electrode with externally driven or internal synthetic waveform
Tier 3:
• Digital control logic
• Supports complex spatial and temporal waveform sequences
Features Enabled by 3D integration Technology
• Ultra-sensitive MFDEP: 70 nanoSiemens/Hz conductivity difference
• Lower device operating power: high electrode density = lower field voltages
• Multiple fractions: large array for complex mixture fractionation
• Detection and assay capable: detection and assay with low-cost (CD-type) micro-optical readout
Real-time systems
High performance computing
Scheduling Fault-tolerance Real-time Control
Discrete event simulations
Load-balancing
Particle-particle simulations
Automatic parallelization
Grid based simulations