P2P Control System based on Map/Reduce

Youngil Kim, Awalin Sopan, Sonia Ng Zeng

Outline

• Introduction
• System architecture
• Implementation – HDFS
• Implementation – System Analysis
  ◦ System Information Logger (SIL)
  ◦ System Information Gatherer (SIG)
  ◦ Map/Reduce
• Implementation – Visualization
• Implementation – P2P Application
• Demo

Introduction

• How can we learn the system status of many nodes?
  ◦ It is hard to track which node has a problem when there are too many nodes
• But… DFS & Map/Reduce make it easy!
  ◦ Analyze system information using Map/Reduce
  ◦ A kind of network management system, like HP

System Architecture

[Figure: system architecture diagram. Labeled components: System Info Gatherer (Hadoop master); SystemManager (visualization); four Hadoop slave nodes, each shown with a Sys Info Logger and a local P2P app.; HDFS. Labeled links: System Control Network, P2P Network, System Information.]

Implementation - HDFS

• Hadoop for DFS & Map/Reduce framework
  ◦ Master: brood00
  ◦ Slaves: currently tested with 5 nodes (bug51 ~ bug55)
  ◦ Uses each node's local storage (not the home directory)
  ◦ Network ports: HDFS (9000), JobTracker (9001), NameNode interface (50070), JobTracker interface (50030)

Implementation - System Analysis

System Information Logger (SIL)

• mr_syslog.py
  ◦ Implemented in Python
  ◦ Saves information in both local storage and HDFS
  ◦ Gathers information about every 10 secs
  ◦ Creates log files named by time
• Information for each node is saved in the following format
  ◦ < 20110501_2252_bug51.log >
  ◦ bug51 1304304720: mem(75.50), cpu(1.00), disk(10.00)
  ◦ bug51 1304304724: mem(75.50), cpu(1.50), disk(10.00)
  ◦ bug51 1304304727: mem(75.51), cpu(0.40), disk(10.00)
  ◦ bug51 1304304729: mem(75.51), cpu(0.50), disk(10.00)
  ◦ bug51 1304304732: mem(75.50), cpu(0.50), disk(10.00)
  ◦ bug51 1304304734: mem(75.50), cpu(0.40), disk(10.00)
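The log layout shown for the SIL can be reproduced with a few lines of Python. This is only a minimal sketch, not the actual mr_syslog.py: the function names `log_filename`, `format_entry`, and `append_sample` are hypothetical, the one-file-per-minute naming is inferred from the sample file name, and both the measurement of memory/CPU/disk usage and the copy into HDFS are left out.

```python
import os
from datetime import datetime, timedelta, timezone


def log_filename(host, epoch, tz=None):
    """Build a per-minute log file name such as 20110501_2252_bug51.log.

    tz=None uses the machine's local time zone (an assumption about SIL).
    """
    stamp = datetime.fromtimestamp(epoch, tz).strftime("%Y%m%d_%H%M")
    return f"{stamp}_{host}.log"


def format_entry(host, epoch, mem, cpu, disk):
    """Format one sample in the layout shown on the slide."""
    return f"{host} {int(epoch)}: mem({mem:.2f}), cpu({cpu:.2f}), disk({disk:.2f})"


def append_sample(directory, host, epoch, mem, cpu, disk, tz=None):
    """Append one entry to the local log file and return the file path."""
    path = os.path.join(directory, log_filename(host, epoch, tz))
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(format_entry(host, epoch, mem, cpu, disk) + "\n")
    return path
```

With a UTC-4 time zone, the first sample timestamp (1304304720) falls in the minute 2011-05-01 22:52, which matches the sample file name.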

System Information Gatherer (SIG)

• Functions
  ◦ Find the current resource usage of each node using Map/Reduce
    ▪ Currently, it shows the maximum values per one-minute time slot
  ◦ Communication gateway between the nodes and the visualization tool
    ▪ Send “QUERY” to each P2P application
    ▪ Send node status to the visualization tool (node ID, (in)active, CPU usage, memory usage, storage)
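The node status that the SIG forwards to the visualization tool carries five fields: node ID, (in)active, CPU usage, memory usage, and storage. A rough sketch of such a record follows. The slides do not give the wire format, so the `NodeStatus` class, the "ACTIVE"/"INACTIVE" reply strings, and the JSON encoding are all assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class NodeStatus:
    node_id: str
    active: bool
    cpu: float
    memory: float
    storage: float


def build_status(node_id, query_reply, maxima):
    """Merge a QUERY reply with reduced [cpu, memory, storage] maxima.

    query_reply is a hypothetical "ACTIVE" or "INACTIVE" string; None
    (no reply from the node) is treated as inactive.
    """
    cpu, memory, storage = maxima
    return NodeStatus(node_id, query_reply == "ACTIVE", cpu, memory, storage)


def to_message(status):
    """Serialize a status record as JSON (an assumed wire format)."""
    return json.dumps(asdict(status), sort_keys=True)
```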

Map/Reduce

• Map:
  ◦ Input – each node's log file
    ▪ Key: position in file
    ▪ Value: raw data, one line per key
  ◦ Output
    ▪ Key: node ID
    ▪ Value: set of system information (CPU/memory/storage usage)
    ▪ Eg: < bug51, [30.0, 29.0, 12.0] >
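The Map step can be sketched as a plain Python function that turns one (file position, raw line) pair into a (node ID, values) pair. This is an illustration rather than the project's mapper: `map_line` and `LINE_RE` are hypothetical names, and the reordering from the logger's mem/cpu/disk layout into CPU/memory/storage order is assumed from the order given in the Map output description.

```python
import re

# Matches lines such as: bug51 1304304720: mem(75.50), cpu(1.00), disk(10.00)
LINE_RE = re.compile(
    r"^(?P<node>\S+)\s+(?P<ts>\d+):\s*"
    r"mem\((?P<mem>[\d.]+)\),\s*cpu\((?P<cpu>[\d.]+)\),\s*disk\((?P<disk>[\d.]+)\)\s*$"
)


def map_line(offset, line):
    """Map one (file position, raw line) pair to (node ID, [cpu, mem, storage]).

    The offset is the input key and is ignored; lines that do not match
    the logger format yield None.
    """
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    values = [float(match[name]) for name in ("cpu", "mem", "disk")]
    return match["node"], values
```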

Map/Reduce

• Reduce:
  ◦ Input – from Map
    ▪ Key: node ID
    ▪ Value: set of sets of system information
    ▪ Eg: < bug51, [ [30.0, 29.0, 12.0], [33.0, 40.0, 9.0], … ] >
  ◦ Output
    ▪ Key: node ID
    ▪ Value: maximum values for each piece of information
    ▪ Eg: < bug51, [33.0, 40.0, 12.0] >
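The Reduce step amounts to an element-wise maximum over all value lists that share a node ID. A minimal sketch, assuming each value is a list of three floats; `reduce_max` and `run_reduce` are hypothetical names, and the in-memory grouping in `run_reduce` only stands in for the shuffle that Hadoop performs between Map and Reduce.

```python
from collections import defaultdict


def reduce_max(node_id, samples):
    """Reduce all samples for one node to the per-field maximum."""
    maxima = [max(column) for column in zip(*samples)]
    return node_id, maxima


def run_reduce(pairs):
    """Group (node ID, values) pairs by key, then reduce each group."""
    grouped = defaultdict(list)
    for node_id, values in pairs:
        grouped[node_id].append(values)
    return dict(reduce_max(node_id, samples) for node_id, samples in grouped.items())
```

Applied to the two sample lists for bug51 in the Reduce input example, `reduce_max` returns the pair shown in the Reduce output example.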

Implementation - Visualization

Implementation – P2P Application

• Not a real application
  ◦ Just shows how to control an application or system on each node through the visualization
  ◦ Only has STOP/RESUME operations
• Functions
  ◦ Response to “QUERY”: show active/inactive
  ◦ Response to “CONTROL”: change status based on the control argument
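The QUERY/CONTROL behavior of the demo application can be sketched as a small state holder. The slides name the two commands and the STOP/RESUME operations but not the message syntax, so the space-separated "CONTROL STOP" form, the "ACTIVE"/"INACTIVE"/"OK"/"ERROR" replies, and the `P2PApp` class itself are assumptions; networking is omitted.

```python
class P2PApp:
    """Toy application that only tracks whether it is active."""

    def __init__(self):
        self.active = True

    def handle(self, message):
        """Handle one text command and return the reply string."""
        parts = message.strip().split()
        if parts == ["QUERY"]:
            # Report the current status without changing it.
            return "ACTIVE" if self.active else "INACTIVE"
        if len(parts) == 2 and parts[0] == "CONTROL":
            # The control argument selects the new status.
            if parts[1] == "STOP":
                self.active = False
                return "OK"
            if parts[1] == "RESUME":
                self.active = True
                return "OK"
        return "ERROR"
```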

Demo

• System set-up and initialization (video file)
• Show NameNode & JobTracker interfaces
• Show Map/Reduce jobs
• Show visualization tool
  ◦ Changes of each status
  ◦ Control each P2P application
