Introduction Computer Science Henri Bal Vrije Universiteit Amsterdam

Goals of this course

● Understand typical Computer Science topics

● Meet with students and some staff members

● Develop skills:

● Reading (English) scientific literature

● Critical/analytical thinking about CS topics

● Discussing

● Presenting

Structure

● Tuesdays: guest lectures

● 3-4 scientific papers provided as context

● Questions made up beforehand

● Working groups

● 3-4 students per group present a paper

● Each group discusses papers + questions

Topics (Tuesday lectures)

● Intro & high-performance computing (Henri Bal)

● Literature search (Michel Klein)

● Watson (Chris Welty, with LI & IMM students)

● e-Health (Aart van Halteren)

● e-Science infrastructures (Cees de Laat)

● Luggage handling at Heathrow Terminal 5 (Huub van der Wouden, with IMM students)

● Bioinformatics (Jaap Heringa)

Working Groups

● Supervised by staff members (instructors)

● First meeting:

● Instructors give introduction and discuss the schedule

● See the instructions on Canvas for how to indicate your preferences before Wednesday 23:59

● Other meetings:

● Students present/discuss papers

● Course material + group composition are on Canvas

● Room number is on rooster.vu.nl (look it up via your instructor)

● For problems, contact George Karlos [email protected]

● Pay attention to announcements on Canvas

Your tasks

● Attend Tuesday lectures

● Send brief answers to questions before workgroup deadline

● Give 1 presentation in a working group

● Make slides, give short presentation

● Participate in working group discussions

● Grading: pass or fail (no numerical grade)

● Participation & presentation must be “sufficient”

Relevance to the CS program

● Illustrates what academic research in Computer Science is about

● Gives a selection of current research topics

● Many topics come back later in our Bachelor or Master programs:

● eScience Infrastructure → Networking course

● eHealth → Ba Lifestyle Informatics

● HPC → MSc track Parallel Computing Systems

● Watson → Artificial Intelligence courses + MSc

● Bioinformatics → Bioinformatics MSc program

First presentation

● My personal view on Computer Science

● Why is Computer Science so interesting?

● Biased towards my own research area:

● High performance distributed computing

Computer Science (CS)

● CS sits between technology and applications, both of which have turbulent developments

● Processors, networks, mobiles, wearables, …

● Data explosion in virtually all applications

● CS also studies many fundamental problems of its own

● Programming languages, security, AI, theory, …

Outline

● Technology

● Computers

● Some history

● High performance computers

● Modern (multicore) PCs

● Networks & mobile computing

● Applications

● Data explosion

● Computation demands

● Fundamental CS questions

Computers

● Mainframe: powerful centralized computer

● IBM 704 (1954)

● Minicomputers: under $25K, for small groups

● PDP-8, PDP-11, VAX (1960s-1980s)

● Workstations: expensive personal graphical machine

● Xerox Alto (1973)

● PCs: inexpensive machine for the masses

● IBM PC (1981)

High Performance Computers

● Computer systems with many processors, all computing in parallel

High Performance Computers (1)

● Vector machines

● Can do vector operations in parallel

● A and B: vectors (1-dimensional arrays) with 100 elements

● Computing A+B (= 100 additions) takes as much time as doing 1 addition on a sequential computer

● History

● 1970s, 1980s (e.g., Cray)

● 2000s (Japanese Earth Simulator)

● 2010s (GPUs, Graphics Processing Units)
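The A+B example above can be made concrete with a small timing model. This is only a sketch in plain Python, not real vector hardware: the functions `add_sequential` and `add_vector` and the `lanes` parameter are hypothetical names, and a "step" simply stands for the time of one addition.

```python
# Two vectors of 100 elements each, as in the slide.
A = list(range(100))
B = list(range(100, 200))

def add_sequential(a, b):
    """Add element by element; count one time step per addition."""
    result, steps = [], 0
    for x, y in zip(a, b):
        result.append(x + y)
        steps += 1
    return result, steps

def add_vector(a, b, lanes=100):
    """Model a vector unit with `lanes` adders working in lockstep.

    Each time step handles up to `lanes` element pairs at once.
    """
    result, steps = [], 0
    for start in range(0, len(a), lanes):
        pairs = zip(a[start:start + lanes], b[start:start + lanes])
        result.extend(x + y for x, y in pairs)
        steps += 1
    return result, steps

seq_result, seq_steps = add_sequential(A, B)
vec_result, vec_steps = add_vector(A, B)
print(seq_steps, vec_steps)  # 100 1
```

Both functions produce the same vector; only the modeled number of time steps differs. With fewer lanes than elements (for example `lanes=32`) the vector version needs several steps, which is closer to how real vector units behave.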

High Performance Computers (2)

● Massively parallel machines

● 1000s of special processors connected by a special network, all running in parallel, each doing part of the overall computations

● E.g., CM-1, CM-5, Intel Paragon, IBM BlueGene

● Design of the interconnection network draws on graph theory (math)
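The idea that each processor does part of the overall computation can be sketched as follows. This is an illustration only: it runs on one machine, the helper `split_evenly` is a hypothetical name, and the choice of 8 workers and of summing the numbers 1 to 1000 is an assumption made for the example.

```python
def split_evenly(n_items, n_workers):
    """Return (start, end) index ranges whose sizes differ by at most one."""
    base, extra = divmod(n_items, n_workers)
    ranges, start = [], 0
    for worker in range(n_workers):
        size = base + (1 if worker < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges

data = list(range(1, 1001))
ranges = split_evenly(len(data), 8)

# On a parallel machine each partial sum would be computed by its own
# processor; here they are computed one after another.
partial_sums = [sum(data[lo:hi]) for lo, hi in ranges]

# Combining the partial results is the communication step.
total = sum(partial_sums)
print(total)  # 500500
```

Each worker gets 125 numbers here, so the work is balanced. Combining the partial sums is where the interconnection network comes in on a real machine.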

High Performance Computers (3)

● Cluster computers

● Parallel machines built from off-the-shelf (commodity) PCs and networks

● Excellent price/performance ratio

● Exponential performance growth of processor speeds

● See http://www.top500.org for the 500 fastest supercomputers

Multicores & Manycores

● All PCs now have more than one compute core

● Every PC is a parallel computer!

● Some PCs already have 48 cores

● Core count will increase to hundreds

● Intel Phi (2012): 60 Pentium-1-class cores on 1 chip, with advanced vector support

● GPUs (Graphics Processing Units) have thousands of very simple cores

● Challenge: how to program these things?

Thinking in parallel is hard

● How to split up the work?

● Load balancing

● All cores should do the same amount of work

● Communication & synchronization

● Cores must exchange data (=overhead)

● Nondeterminism:

● A single processor always gives the same outcome

● With >1 core the outcome may depend on the order in which the cores run (a “race condition” bug)
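The nondeterminism point can be illustrated without real threads by replaying two fixed orderings of the same steps. This is a sketch under assumptions: the function `run`, the core names "A" and "B", and the two-step increment (read, then write) are a simplified model, not how any particular processor is programmed.

```python
def run(schedule):
    """Execute a schedule of (core, action) steps on a shared counter.

    Each core increments the counter once, in two separate steps:
    'read' copies the shared value into a private register,
    'write' stores register + 1 back into the shared counter.
    """
    shared = 0
    register = {}
    for core, action in schedule:
        if action == "read":
            register[core] = shared
        elif action == "write":
            shared = register[core] + 1
    return shared

# Core A finishes before core B starts: both increments are kept.
serial = [("A", "read"), ("A", "write"), ("B", "read"), ("B", "write")]

# Both cores read before either writes: one increment is lost.
interleaved = [("A", "read"), ("B", "read"), ("A", "write"), ("B", "write")]

print(run(serial), run(interleaved))  # 2 1
```

The same four steps give 2 or 1 depending only on their order. On real hardware the order is not under the programmer's control unless synchronization (for example a lock) is added, which is the overhead mentioned above.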

Current debates

● Should we build chips with:

● Very fast/complicated (superscalar) processors?

● Hits a “power wall”, hard to increase clock frequency

● Many slower/simpler (thin) processors?

● Hard to program

● How to deal with energy consumption?

● Performance per Watt becomes a key factor

Networks

● Wide area networks (WANs)

● Local area networks (LANs)

● Mobile networks

● Much more in the Computer Networks class

Wide area networks

● ARPANET

● First computer network, connecting some US sites (1960s)

● Speeds measured in kbit/s

● Internet

● Based on standardized (IP) protocol suite

● Connect everyone/everything (Internet-of-things)

● Dedicated optical networks (light paths)

● 10 Gbit/s, point-to-point
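To get a feel for the gap between kbit/s and 10 Gbit/s, the transfer time of a file can be estimated. The numbers below are assumptions for illustration: a 1 GB file, a 50 kbit/s link standing in for an early ARPANET line, and decimal prefixes (k = 1000), as is conventional for line rates. Protocol overhead and latency are ignored.

```python
def transfer_seconds(size_bytes, rate_bits_per_second):
    """Idealized transfer time: payload bits divided by line rate."""
    return size_bytes * 8 / rate_bits_per_second

FILE_SIZE = 10**9               # assumed file size: 1 GB
ARPANET_RATE = 50 * 10**3       # assumed early link speed: 50 kbit/s
LIGHTPATH_RATE = 10 * 10**9     # 10 Gbit/s light path from the slide

slow = transfer_seconds(FILE_SIZE, ARPANET_RATE)
fast = transfer_seconds(FILE_SIZE, LIGHTPATH_RATE)

print(f"50 kbit/s: {slow / 3600:.1f} hours")
print(f"10 Gbit/s: {fast:.1f} seconds")
print(f"speedup:   {slow / fast:.0f}x")
```

Under these assumptions the same file goes from taking almost two days to taking under a second.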

Local Area Networks

● Ethernet: developed by Xerox PARC (1974)

● Speed increased from 10 Mbit/s to 100 Gbit/s

● Cluster computers use Ethernet or faster commodity networks

● Myrinet

● Infiniband

An aside

● In Computer Science

● k(ilo)=1024

● m(ega)=1024²

● g(iga)=1024³

● t(era)=1024⁴

● p(eta)=1024⁵

● e(xa)=1024⁶

● This all has to do with binary numbers (1024 = 2¹⁰); network speeds such as Mbit/s are conventionally quoted in powers of 1000
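The table above follows one rule: each prefix is the next power of 1024, and 1024 itself is a power of two. A minimal sketch (the names `PREFIXES` and `binary_value` are hypothetical):

```python
PREFIXES = ["kilo", "mega", "giga", "tera", "peta", "exa"]

def binary_value(prefix):
    """Return the binary interpretation of a prefix: 1024 to the power n."""
    power = PREFIXES.index(prefix) + 1
    return 1024 ** power

for power, name in enumerate(PREFIXES, start=1):
    value = binary_value(name)
    # bit_length() - 1 gives the exponent of two for an exact power of two.
    print(f"{name:>4} = 1024^{power} = 2^{value.bit_length() - 1} = {value}")
```

The first line printed is `kilo = 1024^1 = 2^10 = 1024`, and the exponent of two grows by 10 with each prefix.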

DAS-5

● Dual 8-core Intel E5-2630v3 CPUs

● FDR InfiniBand

● OpenFlow switches

● Various accelerators

● CentOS Linux

● Bright Cluster Manager

● Built by ClusterVision

[Figure: DAS-5 sites with node counts: VU (68), TU Delft (48), Leiden (24), UvA/MultimediaN (18/31), ASTRON (9), interconnected by SURFnet7 at 10 Gb/s]

Mobile computing

● Laptops, sensors, smartphones, tablets

● Many forms of mobile networks

● Wi-Fi (local range), Bluetooth (for pairing devices)

● 3G, 4G (lower bandwidth, high coverage)

● Ultimately: ubiquitous computing?

● Vision by Mark Weiser (1988)

● “machines that fit the human environment instead of forcing humans to enter theirs”

● Next: Internet of Things

Networks for Internet-of-Things

● 3G/4G: too expensive for just ‘things’

● Short/medium range communication

● Wi-Fi, ZigBee, Bluetooth, Bluetooth Low Energy

● Low-Power Wide-Area Network (LPWAN)

● LoRaWAN

● Narrowband IoT (NB-IoT)

● (many others)

LoRaWAN

[Figure: LoRa nodes → IoT gateway → Internet → Cloud]

IoT pilot testbed Alkmaar (2017)

● LoRaWAN-based testbed covering the whole city

● Many interested application domains (safety, city analytics, air quality, waste management)

Application developments

● There is a “data explosion” in many application areas

● Huge amounts of data (up to petabytes per year)

● Very complicated/heterogeneous data

● Demand for computing

● Model (simulate) designs on a computer

Data explosion

● Society:

● Web, social networks

● Industry, economy:

● Banks, stock markets

● Science

● LHC (“Higgs particle”)

● Data stored on world-wide “grid”

● Bioinformatics (next generation sequencing)

● Astronomy: software telescopes (LOFAR, SKA)

Computing demands

● Computational science:

● Modeling ozone layer, climate, ocean, human brain

● Simulating galaxies

● Engineering:

● Aircraft modeling, designing F1 cars

● TVs (mostly software), embedded systems

● Games and multimedia:

● Computer chess (Deep Blue)

● Watson (Jeopardy)

● AlphaGo (Google)

● Analyzing multimedia content

● Digital forensics

● Generating movies

Pixar’s “Up” (2009)

Whole movie (96 minutes) would take 94 years on 1 PC

(4 frames per day; 1 second takes 6 days; 1 minute per year)
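The arithmetic behind these figures can be checked in a few lines. The frame rate of 24 frames per second is an assumption (it is what makes 4 frames per day equal 6 days per second of film), and `machines_needed` is a hypothetical helper that assumes frames can be rendered independently with perfect parallel speedup.

```python
import math

FRAMES_PER_SECOND = 24   # assumed film frame rate
FRAMES_PER_DAY = 4       # rendering speed of one PC, from the slide
MOVIE_MINUTES = 96

days_per_movie_second = FRAMES_PER_SECOND / FRAMES_PER_DAY   # 6 days
days_per_movie_minute = days_per_movie_second * 60           # 360 days, about a year
total_days = days_per_movie_minute * MOVIE_MINUTES
total_years = total_days / 365

def machines_needed(target_days):
    """PCs required to finish within target_days, assuming perfect speedup."""
    return math.ceil(total_days / target_days)

print(f"{total_days:.0f} days, about {int(total_years)} years on 1 PC")
print(f"{machines_needed(365)} PCs to finish within one year")
```

This is why rendering is done on large clusters: frames are independent, so the work splits almost perfectly across machines.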

Some fundamental Computer Science topics (1)

● Operating systems:

● Windows, Linux, Minix (Andy Tanenbaum)

● Programming languages and systems

● Fortran, Cobol, C, Java, Python … (thousands)

What happens if you ask a computer scientist to solve a problem?

He/she will come back 3 months later, with …

a new programming language ideally suited for solving your problem

Some fundamental Computer Science topics (2)

● Security

● Preventing/detecting attacks, privacy, etc.

● (Semantic) web technology

● Finding and reasoning about content on the web

● Cloud computing

● Store data and programs remotely, in the Cloud

Some fundamental Computer Science topics (3)

● Artificial intelligence

● E.g. machine learning (deep learning)

● Databases

● Storing and searching huge amounts of data

● Logic, modelling, graph theory, complexity

● Essential for many applications

Conclusion

● Modern Computer Science deals with hectic developments in technology and applications

● Both provide us with many research problems

● Application-driven vs technology-driven research

● There are also many fundamental CS problems