
Pós-Graduação em Ciência da Computação

"An Approach for Profiling Distributed Applications Through Network Traffic Analysis"

By

THIAGO PEREIRA DE BRITO VIEIRA

Master's Dissertation

Universidade Federal de Pernambuco
[email protected]
www.cin.ufpe.br/~posgraduacao

RECIFE, MARCH 2013


UNIVERSIDADE FEDERAL DE PERNAMBUCO

CENTRO DE INFORMÁTICA

PÓS-GRADUAÇÃO EM CIÊNCIA DA COMPUTAÇÃO

THIAGO PEREIRA DE BRITO VIEIRA

"AN APPROACH FOR PROFILING DISTRIBUTED APPLICATIONS THROUGH NETWORK TRAFFIC ANALYSIS"

THIS WORK WAS PRESENTED TO THE GRADUATE PROGRAM IN COMPUTER SCIENCE OF THE CENTRO DE INFORMÁTICA OF THE UNIVERSIDADE FEDERAL DE PERNAMBUCO AS A PARTIAL REQUIREMENT FOR THE DEGREE OF MASTER IN COMPUTER SCIENCE.

ADVISOR: Vinicius Cardoso Garcia
CO-ADVISOR: Stenio Flavio de Lacerda Fernandes

RECIFE, MARCH 2013


Cataloging at source

Librarian Jane Souto Maior, CRB4-571

Vieira, Thiago Pereira de Brito
An approach for profiling distributed applications through network traffic analysis. / Thiago Pereira de Brito Vieira. - Recife: The Author, 2013. xv, 71 leaves: fig., tab.
Advisor: Vinicius Cardoso Garcia.

Dissertation (Master's) - Universidade Federal de Pernambuco. CIn, Ciência da Computação, 2013.

Includes bibliography. 1. Computer science. 2. Distributed systems. I. Garcia, Vinicius Cardoso (advisor). II. Title. 004 CDD (23. ed.) MEI2013 – 054


Master's dissertation presented by Thiago Pereira de Brito Vieira to the Graduate Program in Computer Science of the Centro de Informática of the Universidade Federal de Pernambuco, under the title "An Approach for Profiling Distributed Applications Through Network Traffic Analysis", advised by Prof. Vinicius Cardoso Garcia and approved by the examination committee formed by the following professors:

______________________________________________
Prof. José Augusto Suruagy Monteiro
Centro de Informática / UFPE

______________________________________________
Prof. Denio Mariz Timoteo de Souza
Instituto Federal da Paraíba

______________________________________________
Prof. Vinicius Cardoso Garcia
Centro de Informática / UFPE

Approved, and printing permitted. Recife, March 5, 2013.

___________________________________________________
Profa. Edna Natividade da Silva Barros
Coordinator of the Graduate Program in Computer Science of the Centro de Informática of the Universidade Federal de Pernambuco.


I dedicate this dissertation to my parents, for teaching me to always study and work in order to grow as a person and as a professional.


Acknowledgments

First, I would like to thank God for life, health, and all the opportunities created in my life.

I thank my parents, João and Ana, for all the love, affection, and encouragement to always pursue personal and professional growth, and for always supporting my decisions and showing themselves concerned with, and committed to, helping me reach my goals.

I thank Alynne, my future wife, for all the love and patience throughout our relationship, especially during these two intense years of the Master's program, in which her words of support in difficult moments were essential, and her light-heartedness gave me more energy and will to carry on with ever greater dedication.

I thank the Agência Nacional de Telecomunicações - Anatel for allowing and providing one more learning experience in my life. I would especially like to thank Rodrigo Barbosa, Túlio Barbosa, and Jane Teixeira for understanding and supporting me in this challenge of pursuing a Master's degree. I thank Marcio Formiga for the support before and during the Master's program, and for understanding the effort needed to overcome this challenge. I thank Wesley Paesano, Marcelo de Oliveira, Regis Novais, and Danilo Balby for the support that allowed me to dedicate myself to the Master's program during these two years. I also thank the friends at Anatel who, directly or indirectly, helped me face this challenge, especially Ricardo de Holanda, Rodrigo Curi, Esdras Hoche, Francisco Paulo, Cláudio Moonen, Otávio Barbosa, Hélio Silva, Bruno Preto, Luide Liude, and Alexandre Augusto.

I thank all those who guided and taught me during this Master's program, especially Vinicius Garcia for the welcome, support, guidance, demands, and all the important lessons during these months. I thank Stenio Fernandes for all the lessons and guidance at important moments of my research. I thank Rodrigo Assad for the work done together on usto.re and for the guidance that steered the development of my research. I thank Marcelo D'Amorim for the initial welcome and for the work we did together, which was of great value for my entry into scientific research and for my development as a researcher.

I thank José Augusto Suruagy and Denio Mariz for agreeing to take part in my dissertation defense committee, and for their valuable criticism and contributions to my work.

I thank all the friends I made during this Master's period, who contributed to making these days dedicated to the Master's program quite pleasant. I would like to thank Paulo Fernando, Lenin Abadie, Marco Machado, Dhiego Abrantes, Rodolfo Arruda, Francisco Soares, Sabrina Souto, Adriano Tito, Hélio Rodrigues, Jamilson Batista, Bruno Felipe, and the other people I had the pleasure of meeting during this period.

I also thank all my old friends from João Pessoa, Geisel, UFPB, and CEFET-PB, who gave me so much support and encouragement to develop this work.

Finally, I thank all those who collaborated, directly or indirectly, in carrying out this work.

Thank you very much!!!


Wherever you go, go with all your heart.

—CONFUCIUS


Resumo

Distributed systems have been used to build modern Internet services and cloud computing infrastructures, with the aim of obtaining services with high performance, scalability, and reliability. The service level agreements adopted in cloud computing require a short time to identify, diagnose, and solve problems in the infrastructure, so as to prevent problems from negatively impacting the quality of the services provided to clients. Thus, the detection of error causes, and the diagnosis and reproduction of errors in distributed systems, are challenges that motivate efforts towards the development of less intrusive and more efficient mechanisms for monitoring and debugging distributed applications at runtime.

Network traffic analysis is one option for measuring distributed systems, although there are limitations in the capacity to process large amounts of network traffic in a short time, and in the scalability to process network traffic under varying resource demand.

The goal of this dissertation is to analyse the processing capacity problem of measuring distributed systems through network traffic analysis, with the aim of evaluating the performance of distributed systems in a data center, using commodity hardware and cloud computing services, in a minimally intrusive way.

We propose a new MapReduce-based approach to deeply inspect the network traffic of distributed applications, with the goal of evaluating the performance of distributed systems at runtime, using commodity hardware. In this dissertation we evaluate the effectiveness of MapReduce for a deep packet inspection algorithm, its processing capacity, the speedup in job completion time, the scalability of its processing capacity, and the behavior followed by the MapReduce phases when applied to deep packet inspection for extracting indicators of distributed applications.

Keywords: Distributed Application Measurement, Debugging, MapReduce, Network Traffic Analysis, Packet Level Analysis, Deep Packet Inspection


Abstract

Distributed systems have been adopted for building modern Internet services and cloud computing infrastructures, in order to obtain services with high performance, scalability, and reliability. Cloud computing SLAs require a short time to identify, diagnose, and solve problems in a cloud computing production infrastructure, in order to avoid negative impacts on the quality of service provided to clients. Thus, the detection of error causes, and the diagnosis and reproduction of errors, are challenges that motivate efforts towards the development of less intrusive mechanisms for monitoring and debugging distributed applications at runtime.

Network traffic analysis is one option for distributed systems measurement, although there are limitations in the capacity to process large amounts of network traffic in a short time, and in the scalability to process network traffic under varying resource demand.

The goal of this dissertation is to analyse the processing capacity problem of measuring distributed systems through network traffic analysis, in order to evaluate the performance of distributed systems in a data center, using commodity hardware and cloud computing services, in a minimally intrusive way.

We propose a new approach based on MapReduce for the deep inspection of distributed application traffic, in order to evaluate the performance of distributed systems at runtime, using commodity hardware. In this dissertation we evaluate the effectiveness of MapReduce for a deep packet inspection algorithm, its processing capacity, completion time speedup, processing capacity scalability, and the behavior followed by the MapReduce phases when applied to deep packet inspection for extracting indicators of distributed applications.

Keywords: Distributed Application Measurement, Profiling, MapReduce, Network Traffic Analysis, Packet Level Analysis, Deep Packet Inspection


Contents

List of Figures
List of Tables
List of Acronyms

1 Introduction
  1.1 Motivation
  1.2 Problem Statement
  1.3 Contributions
  1.4 Dissertation Organization

2 Background and Related Work
  2.1 Background
    2.1.1 Network Traffic Analysis
    2.1.2 JXTA
    2.1.3 MapReduce
  2.2 Related Work
    2.2.1 Distributed Debugging
    2.2.2 MapReduce for Network Traffic Analysis
  2.3 Chapter Summary

3 Profiling Distributed Applications Through Deep Packet Inspection
  3.1 Motivation
  3.2 Architecture
  3.3 Evaluation
    3.3.1 Evaluation Methodology
    3.3.2 Experiment Setup
  3.4 Results
  3.5 Discussion
    3.5.1 Results Discussion
    3.5.2 Possible Threats to Validity
  3.6 Chapter Summary

4 Evaluating MapReduce for Network Traffic Analysis
  4.1 Motivation
  4.2 Evaluation
    4.2.1 Evaluation Methodology
    4.2.2 Experiment Setup
  4.3 Results
  4.4 Discussion
    4.4.1 Results Discussion
    4.4.2 Possible Threats to Validity
  4.5 Chapter Summary

5 Conclusion and Future Work
  5.1 Conclusion
  5.2 Contributions
    5.2.1 Lessons Learned
  5.3 Future Work

Bibliography


List of Figures

2.1 Differences between packet level analysis and deep packet inspection
2.2 MapReduce input dataset splitting into blocks and into records
3.1 Architecture of the SnifferServer to capture and store network traffic
3.2 Architecture for network traffic analysis using MapReduce
3.3 JXTA Socket trace analysis
3.4 Completion time scalability of MapReduce for DPI: (a) scalability to process 16 GB; (b) scalability to process 34 GB
4.1 DPI completion time and speed-up of MapReduce for 90 GB of JXTA-application network traffic
4.2 DPI processing capacity for 90 GB
4.3 MapReduce phases behaviour for DPI of 90 GB: (a) phases time for DPI; (b) phases distribution for DPI
4.4 Completion time comparison of MapReduce for packet level analysis, evaluating the approach with and without splitting into packets
4.5 CountUp completion time and speed-up for 90 GB: (a) P3 evaluation; (b) CountUpDriver evaluation
4.6 CountUp processing capacity for 90 GB: (a) P3 processing capacity; (b) CountUpDriver processing capacity
4.7 MapReduce phases time of CountUp for 90 GB: (a) MapReduce phases times of P3; (b) MapReduce phases times for CountUpDriver
4.8 MapReduce phases distribution for CountUp of 90 GB: (a) phases distribution for P3; (b) phases distribution for CountUpDriver
4.9 DPI evaluation of MapReduce for 30 GB: (a) DPI completion time and speed-up of MapReduce for 30 GB of JXTA-application network traffic; (b) DPI processing capacity for 30 GB

List of Tables

3.1 Metrics to evaluate MapReduce effectiveness and completion time scalability for DPI of JXTA-based network traffic
3.2 Factors and levels to evaluate the defined metrics
3.3 Hypotheses to evaluate the defined metrics
3.4 Hypothesis notation
3.5 Completion time to process 16 GB split into 35 files
3.6 Completion time to process 34 GB split into 79 files
4.1 Metrics for evaluating MapReduce for DPI and packet level analysis
4.2 Factors and levels
4.3 Non-distributed execution time in seconds


List of Acronyms

DPI Deep Packet Inspection

EC2 Elastic Compute Cloud

GQM Goal Question Metric

HDFS Hadoop Distributed File System

IP Internet Protocol

I/O Input/Output

JVM Java Virtual Machine

MBFS Message Based Per Flow State

MBPS Message Based Per Protocol State

PBFS Packet Based Per Flow State

PBNS Packet Based No State

PCAP Packet Capture

PDU Protocol Data Unit

POSIX Portable Operating System Interface

RTT Round-Trip Time

SLA Service Level Agreement

TCP Transmission Control Protocol

UDP User Datagram Protocol


1 Introduction

Though nobody can go back and make a new beginning, anyone can start over and make a new ending.

—CHICO XAVIER

1.1 Motivation

Distributed systems have been adopted for building high performance systems, due to the possibility of obtaining high fault tolerance, scalability, availability, and efficient use of resources (Cox et al., 2002; Antoniu et al., 2007). Modern Internet services and cloud computing infrastructures are commonly implemented as distributed systems, to provide services with high performance and reliability (Mi et al., 2012). Cloud computing SLAs require a short time to identify, diagnose, and solve problems in the production infrastructure, in order to avoid negative impacts on the quality of service provided to clients. Thus, monitoring and performance analysis of distributed systems in production environments have become more necessary with the growth of cloud computing and the use of distributed systems to provide services and infrastructure as a service (Fox et al., 2009; Yu et al., 2011).

In the development, maintenance, and administration of distributed systems, the detection of error causes, and the diagnosis and reproduction of errors, are challenges that motivate efforts towards the development of less intrusive and more effective mechanisms for monitoring and debugging distributed applications at runtime (Armbrust et al., 2010). Distributed measurement systems (Massie et al., 2004) and log analysers (Oliner et al., 2012) provide relevant information about some aspects of a distributed system, but this information can be complemented by correlated information from other sources (Zheng et al., 2012), such as network traffic analysis, which can provide valuable information about a distributed application and its environment, and also increase the number of information sources, making evaluations of complex distributed systems more effective. Simulators (Paul, 2010), emulators, and testbeds (Loiseau et al., 2009; Gupta et al., 2011) are also used to evaluate distributed systems, but these approaches fail to reproduce the production behavior of a distributed system and its relation to a complex environment, such as a cloud computing environment (Loiseau et al., 2009; Gupta et al., 2011).

Monitoring and diagnosing production failures of distributed systems require low intrusion, high accuracy, and fast results. It is complex to achieve these requirements, because distributed systems usually involve asynchronous communication, unpredictable network message issues, a high number of resources to be monitored in a short time, and black-box components (Yuan et al., 2011; Nagaraj et al., 2012). To measure distributed systems with less intrusion and less dependency on developers, approaches with low dependency on source code or instrumentation are necessary, such as log analysis or network traffic analysis (Aguilera et al., 2003).

It is possible to measure, evaluate, and diagnose distributed applications through the evaluation of information from communication protocols, flows, throughput, and load distribution (Mi et al., 2012; Nagaraj et al., 2012; Sambasivan et al., 2011; Aguilera et al., 2003; Yu et al., 2011). This information can be collected through network traffic analysis, but to retrieve this kind of information from distributed application traffic it is necessary to recognize application protocols and to perform DPI, in order to retrieve details of application behaviors, sessions, and states.

Network traffic analysis is one option to evaluate distributed systems' performance (Yu et al., 2011), although there are limitations in the processing capacity to deal with large amounts of network traffic in a short time, in the scalability to process network traffic under varying resource demands, and in the complexity of obtaining information about a distributed application's behavior from network traffic (Loiseau et al., 2009; Callado et al., 2009). To evaluate application information from network traffic it is necessary to use DPI and to extract information from application protocols, which requires additional effort in comparison with traditional DPI approaches, which usually do not evaluate the content of application protocols and application states.

In the production environment of a cloud computing provider, DPI can be used to evaluate and diagnose distributed applications, through the analysis of application traffic inside a data center. However, this kind of DPI presents differences and requires more effort than common DPI approaches. DPI is usually used to inspect all network traffic that arrives at a data center, but this approach would not provide reasonable performance for inspecting application protocols and their states, due to the massive volumes of network traffic to be evaluated online, and the computational cost of performing this kind of evaluation in a short time (Callado et al., 2009).

Packet level analysis can also be used to evaluate packet flows and the load distribution of network traffic inside a data center (Kandula et al., 2009), providing valuable information about the behavior of a distributed system and about the dimension, capacity, and usage of network resources. However, with packet level analysis it is not possible to evaluate application messages, protocols, and their states.

Although much work has been done to improve DPI performance (Fernandes et al., 2009; Antonello et al., 2012), the evaluation of application states through traffic analysis decreases the processing capacity of DPI for evaluating large amounts of network traffic. With the growth of link speeds, Internet traffic exchange, and the use of distributed systems to provide Internet services (Sigelman et al., 2010), approaches need to be developed that can deal with the analysis of the growing amount of network traffic, to permit the efficient evaluation of distributed systems through network traffic analysis.

MapReduce (Dean and Ghemawat, 2008), which was proposed for the distributed processing of large datasets, can be an option to deal with large amounts of network traffic. MapReduce is a programming model and an associated implementation for processing and generating large datasets. It has become an important programming model and distribution platform for processing large amounts of data, with diverse use cases in academia and industry (Zaharia et al., 2008; Guo et al., 2012). MapReduce is a restricted programming model that easily and automatically parallelizes the execution of user functions and provides transparent fault tolerance (Dean and Ghemawat, 2008). Based on combinators from functional languages, it provides a simple programming paradigm for parallel processing that is increasingly being used for data-intensive applications in cloud computing environments.

MapReduce can be used for network packet level analysis (Lee et al., 2011), which evaluates each packet individually to obtain information from the network and transport layers. Lee et al. (2011) proposed an approach to perform network packet level analysis through MapReduce, using network traces split into packets, to process each one individually and to extract indicators from IP, TCP, and UDP. However, profiling an application through network traffic analysis requires deep packet inspection, in order to evaluate the content of the application layer, to evaluate application protocols, and to reassemble application messages.


Because the approach proposed by Lee et al. (2011) is not able to evaluate more than one packet per MapReduce iteration, nor to analyse application messages, a new MapReduce approach is necessary to perform DPI algorithms for profiling applications through network traffic analysis.

The kind of workload submitted for processing by MapReduce impacts the behaviour and performance of MapReduce (Tan et al., 2012; Groot, 2012), requiring specific configuration to obtain optimal performance. Information about the occupation of the MapReduce phases, about the processing characteristics (whether the job is I/O or CPU bound), and about the mean duration of Map and Reduce tasks, can be used to optimize MapReduce parameter configurations, in order to improve resource allocation and task scheduling.

Although studies have been done to understand, analyse, and improve workload management decisions in MapReduce (Lu et al., 2012; Groot, 2012), there is no evaluation that characterizes the MapReduce behaviour or identifies its optimal configuration to achieve the best performance for packet level analysis and DPI.

1.2 Problem Statement

MapReduce can express several kinds of problems, but not all. MapReduce does not efficiently express incremental, dependent, or recursive computations (Bhatotia et al., 2011; Lin, 2012), because it adopts batch processing and functions executed independently, without shared state or data. Although MapReduce is restrictive, it provides a good fit for many problems involving the processing of large datasets. MapReduce's expressiveness limitations may be reduced by decomposing problems into multiple MapReduce iterations, or by combining MapReduce with other programming models for sub-problems (Lämmel, 2007; Lin, 2012), although the decomposition into iterations increases the completion time of MapReduce jobs (Lämmel, 2007).

DPI algorithms require the evaluation of one or more packets to retrieve information from application layer messages; this represents a data dependency when mounting an application message from network packets, and it is a restriction on using MapReduce for DPI. Because Lee et al. (2011)'s MapReduce approach for packet level analysis processes each packet individually, it cannot be used to evaluate more than one packet per user-defined Map function, nor to efficiently reassemble an application message from network traces. Thus a new approach is necessary to use MapReduce to perform DPI, evaluating the effectiveness of MapReduce to express DPI algorithms.


In elastic environments, like cloud computing providers, where users can request or discard resources dynamically, it is important to know how to perform provisioning and resource allocation in an optimal way. To run MapReduce jobs efficiently, the allocated resources need to be matched to the workload characteristics, and they should be sufficient to meet a requested processing capacity or deadline (Lee, 2012).

The main performance evaluations of MapReduce concern text processing (Zaharia et al., 2008; Chen et al., 2011; Jiang et al., 2010; Wang et al., 2009), where the input data is split into blocks and into records, to be processed by parallel and independent Map functions. Although studies have been done to understand, analyse, and improve workload decisions in MapReduce (Lu et al., 2012; Groot, 2012), there is no evaluation that characterizes the MapReduce behavior or identifies its optimal configuration to achieve the best performance for packet level analysis and DPI. Thus, it is necessary to characterize MapReduce jobs for packet level analysis and DPI, in order to permit an optimal configuration that achieves the best performance, and to obtain information that can be used to predict or simulate the completion time of a job with given resources, so as to determine whether the job will finish by the deadline with the allocated resources (Lee, 2012).

The goal of this dissertation is to analyse the processing capacity problem of measuring distributed systems through network traffic analysis, proposing a solution able to perform deep inspection of distributed application traffic, in order to evaluate distributed systems in a data center, using commodity hardware and cloud computing services, in a minimally intrusive way. Thus, we developed a MapReduce-based approach to evaluate the behavior of distributed systems through DPI, and we evaluated the effectiveness of MapReduce for a DPI algorithm, as well as its completion time scalability under node addition to the cluster, measuring a JXTA-based application using virtual machines of a cloud computing provider. We also evaluated the MapReduce performance for packet level analysis and DPI, characterizing the behavior followed by the MapReduce phases, the processing capacity scalability, and the speed-up. In this evaluation we assessed the impact caused by variations of input size, block size, and cluster size.

1.3 Contributions

We analyse the processing capacity problem of measuring distributed systems through network traffic analysis. The results of the work presented in this dissertation provide the following contributions:


1. We proposed an approach to implement DPI algorithms through MapReduce, using whole blocks as input for Map functions. We showed the effectiveness of MapReduce for a DPI algorithm that extracts indicators from distributed application traffic, as well as the completion time scalability of MapReduce under node addition to the cluster, for DPI on virtual machines of a cloud computing provider;

2. We characterized the behavior followed by the MapReduce phases for packet level analysis and DPI, showing that this kind of job is Map-phase intensive and highlighting points for improvement;

3. We described the processing capacity scalability of MapReduce for packet level analysis and DPI, evaluating the impact caused by variations in input, cluster, and block size;

4. We showed the speed-up obtained with MapReduce for DPI, under variations of input, cluster, and block size.

1.4 Dissertation Organization

The remainder of this dissertation is organized as follows.

In Chapter 2, we provide background information on network traffic analysis and MapReduce, and we review previous work related to the measurement of distributed applications at runtime and to the use of MapReduce for network traffic analysis.

In Chapter 3, we look at the problem of distributed application monitoring and the restrictions on using MapReduce for profiling application traffic. There are limitations in the capacity to process large amounts of network packets in a short time, and in the scalability to process network traffic under variations of throughput and resource demand. To address this problem, we present an approach for profiling application traffic using MapReduce. Experiments show the effectiveness of our approach for profiling applications through DPI and MapReduce, and show the completion time scalability achieved in a cloud computing provider.

In Chapter 4, we present a performance evaluation of MapReduce for network traffic analysis. Due to the lack of evaluations of MapReduce for traffic analysis and the peculiarity of this kind of data, this chapter deeply evaluates the performance of MapReduce for packet level analysis and DPI of distributed application traffic, evaluating the MapReduce scalability, speed-up, and behavior followed by the MapReduce phases. The experiments evidence the predominant phases in this kind of MapReduce job, and show the impact caused by the input size, block size, and number of nodes on the job completion time and on the scalability achieved through the use of MapReduce.

In Chapter 5, we conclude the work, summarize our contributions, and present future work.


2 Background and Related Work

No one knows it all. No one is ignorant of everything. We all know something. We are all ignorant of something.

—PAULO FREIRE

In this chapter, we provide background information on network traffic analysis, JXTA, and MapReduce, and we review previous studies related to the measurement of distributed applications and to the use of MapReduce for network traffic analysis.

2.1 Background

2.1.1 Network Traffic Analysis

Network traffic measurement can be divided into active and passive measurement, and a measurement can be performed at the packet or flow level. In packet level analysis, the measurements are performed on each packet transmitted across the measurement point. Common packet inspection only analyses the content up to the transport layer, including the source address, destination address, source port, destination port, and protocol type, but packet inspection can also analyse the packet payload, performing deep packet inspection.

Risso et al. (2008) presented a taxonomy of the methods that can be used for network traffic analysis. According to Risso et al. (2008), Packet Based No State (PBNS) operates by checking the value of some fields present in each packet, such as the TCP or UDP ports, so this method is computationally very simple. Packet Based Per Flow State (PBFS) requires a session table to manage session identification (source/destination address, transport-layer protocol, source/destination port) and the corresponding application layer protocol, in order to be able to scan the payload looking for a specific rule, usually an application-layer signature, which increases the processing complexity of this method. Message Based Per Flow State (MBFS) operates on messages instead of packets. This method requires a TCP/IP reassembler to handle IP fragments and TCP segments. In this case, memory requirements increase because of the additional state information that must be kept for each session, and because of the buffers required by the TCP/IP reassembler. Message Based Per Protocol State (MBPS) interprets exactly what each application sends and receives. An MBPS processor understands not only the semantics of the messages, but also the different phases of a message exchange, because it has a full understanding of the protocol state machine. Memory requirements become even larger, because this method needs to take into account not only the state of the transport session, but also the state of each application layer session. Processing power requirements are also the highest, because protocol conformance analysis requires processing the entire application data, while the previous methods are limited to the first packets of each session.
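To make the cheaper end of this taxonomy concrete, the sketch below contrasts a PBNS-style stateless port check with the 5-tuple session key that PBFS maintains per flow. It is an illustrative reconstruction, not code from Risso et al. (2008); all class and method names are ours.

    // Illustrative contrast between the two packet-based methods: PBNS
    // inspects only header fields such as ports, while PBFS additionally
    // keys a session table on the classic 5-tuple carried in each packet.
    import java.util.HashMap;
    import java.util.Map;

    public final class PacketMethods {

        // PBNS: stateless port check, computationally trivial.
        static boolean isHttpByPort(int srcPort, int dstPort) {
            return srcPort == 80 || dstPort == 80;
        }

        // PBFS: per-flow state, keyed by the 5-tuple of each packet.
        static final Map<String, String> sessionTable = new HashMap<>();

        static void rememberFlow(String srcIp, int srcPort, String dstIp,
                                 int dstPort, int protocol, String appProtocol) {
            String flowKey = srcIp + ":" + srcPort + "->" + dstIp + ":" + dstPort
                    + "/" + protocol;
            sessionTable.put(flowKey, appProtocol); // state that PBNS never keeps
        }
    }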

Figure 2.1 illustrates the difference between packet level analysis and DPI over PCAP files; it shows that packet level analysis evaluates each packet individually, while DPI requires the evaluation of more than one packet, in order to reassemble packets and obtain an application message.

Figure 2.1 Differences between packet level analysis and deep packet inspection

DPI refers to examining both the packet header and the complete payload to look for predefined patterns or rules. A pattern or rule can be a particular TCP connection, defined by source and destination IP addresses and port numbers; it can also be the signature string of a virus, or a segment of malicious code (Piyachon and Luo, 2006). Antonello et al. (2012) argue that many critical network services rely on the inspection of packet payloads, instead of only looking at the information in packet headers. Although DPI systems are essentially more accurate in identifying application protocols and application messages, they are also resource-intensive and may not scale well with growing link speeds. MBFS, MBPS, and DPI evaluate the content of the application layer, so it is necessary to recognize the content of the evaluated message, and encrypted messages can make this kind of evaluation infeasible.
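As a minimal illustration of the pattern matching just described, the sketch below scans a reassembled payload for a predefined byte signature using a naive search. Real DPI engines use optimized multi-pattern algorithms; the signature here is an arbitrary example, not a real rule.

    // Naive DPI-style payload scan: report whether the payload contains
    // the signature bytes anywhere.
    public final class SignatureScan {

        static boolean matches(byte[] payload, byte[] signature) {
            outer:
            for (int i = 0; i + signature.length <= payload.length; i++) {
                for (int j = 0; j < signature.length; j++) {
                    if (payload[i + j] != signature[j]) {
                        continue outer; // mismatch, try the next offset
                    }
                }
                return true; // all signature bytes matched at offset i
            }
            return false;
        }

        public static void main(String[] args) {
            byte[] payload = "GET /index.html HTTP/1.1".getBytes();
            byte[] signature = "HTTP/1.1".getBytes(); // arbitrary example rule
            System.out.println(matches(payload, signature)); // prints true
        }
    }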

2.1.2 JXTA

JXTA is a specification for peer-to-peer networking that attempts to formulate standard peer-to-peer protocols, in order to provide an infrastructure for building peer-to-peer applications, through basic functionality for peer resource discovery, communication, and organization. JXTA introduces an overlay on top of the existing physical network, with its own addressing and routing (Duigou, 2003; Halepovic and Deters, 2003).

According to the JXTA specification (Duigou, 2003), JXTA peers communicate through messages transmitted over pipes, which are an abstraction of virtual channels, composed of input and output channels, for peer-to-peer communication. Pipes are not bound to a physical location; each pipe has its own unique ID, and a peer can carry its pipe with it even when its physical network location changes. Pipes are asynchronous, unidirectional, and unreliable, but bidirectional and reliable services are provided on top of them. JXTA uses source-based routing: each message carries its routing information as a sequence of peers, and peers along the path may update this information. The JXTA socket adds reliability and bidirectionality to JXTA communications through one layer of abstraction on top of the pipes (Antoniu et al., 2005), and it provides an interface similar to the POSIX sockets specification. JXTA messages are XML documents composed of well-defined and ordered message elements.
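A rough sketch of what this socket-like interface looks like in code is given below, assuming the JXSE 2.x API, in which net.jxta.socket.JxtaSocket extends java.net.Socket. The constructor form used here, and the idea that the peer group and pipe advertisement come from platform bootstrap and discovery, are assumptions for illustration, not details taken from this dissertation.

    // Hedged sketch of JXTA socket usage (assumed JXSE 2.x API). Obtaining
    // the PeerGroup and the PipeAdvertisement is omitted.
    import java.io.InputStream;
    import java.io.OutputStream;

    import net.jxta.peergroup.PeerGroup;
    import net.jxta.protocol.PipeAdvertisement;
    import net.jxta.socket.JxtaSocket;

    public class JxtaEchoClient {

        static String ping(PeerGroup group, PipeAdvertisement pipeAdv) throws Exception {
            // The pipe advertisement identifies the remote service; the socket
            // layers reliability and bidirectionality on top of JXTA pipes.
            try (JxtaSocket socket = new JxtaSocket(group, pipeAdv)) {
                OutputStream out = socket.getOutputStream();
                out.write("ping".getBytes("UTF-8"));
                out.flush();

                InputStream in = socket.getInputStream();
                byte[] buffer = new byte[1024];
                int read = in.read(buffer);
                return read < 0 ? "" : new String(buffer, 0, read, "UTF-8");
            }
        }
    }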

Halepovic and Deters (2005) proposed a performance model, describing important metrics to evaluate JXTA throughput, scalability, and services, and the JXTA behavior across different versions. Halepovic et al. (2005) analysed JXTA performance, showing the increasing cost and latency under higher workloads and concurrent requests, and suggested further evaluations of JXTA scalability with large peer groups in direct communication. Halepovic (2004) notes that network traffic analysis is a feasible approach to the performance evaluation of JXTA-based applications, but did not adopt it, due to the lack of JXTA traffic characterization. Although there are performance models and evaluations of JXTA, there are no evaluations of its current versions, and there are no mechanisms to evaluate JXTA applications at runtime. Because JXTA is still used for building peer-to-peer systems, such as the U-Store (Fonseca et al., 2012), which motivates our research, a solution is necessary to measure JXTA-based applications at runtime and to provide information about their behavior and performance.

2.1.3 MapReduce

MapReduce (Dean and Ghemawat, 2008) is a programming model and a framework for processing large datasets through distributed computing, providing fault tolerance and high scalability for big data processing. The MapReduce model was designed for unstructured data processed by clusters of commodity hardware. Its functional style of Map and Reduce functions automatically parallelizes and executes large jobs in a cluster. MapReduce also handles failures, application deployment, task duplication, and aggregation of results, thereby allowing programmers to focus on the core logic of applications.

An application executed through MapReduce is called a job. The input data of a job, which is stored in a distributed file system, is split into even-sized blocks and replicated for fault tolerance. Figure 2.2 shows the input dataset splitting adopted by MapReduce.

Figure 2.2 MapReduce input dataset splitting into blocks and into records


Initially, the input dataset is split into blocks and stored in the adopted distributed file system. During the execution of a job over a dataset, each split is assigned to be processed by a Mapper; thus the number of splits of the input determines the number of Map tasks of a MapReduce job. Each Mapper reads its split from the distributed file system and divides it into records, to be processed by the user-defined Map function. Each Map function generates intermediate data from the evaluated block, which will be fetched, ordered by key, and processed by the Reducers to generate the output of the MapReduce job.
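The record-by-record contract just described is easiest to see in code. The classic word count below uses the stock Hadoop API (org.apache.hadoop.mapreduce); it is a generic illustration of the programming model, not one of the analysis jobs developed in this dissertation.

    // Each Mapper parses its split into records and emits intermediate
    // (key, value) pairs; the framework groups them by key, and each
    // Reducer aggregates the values of one key.
    import java.io.IOException;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;

    public class WordCount {

        public static class TokenMapper
                extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            @Override
            protected void map(LongWritable offset, Text record, Context context)
                    throws IOException, InterruptedException {
                for (String token : record.toString().split("\\s+")) {
                    word.set(token);
                    context.write(word, ONE); // intermediate (key, value) pair
                }
            }
        }

        public static class SumReducer
                extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> counts, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable c : counts) {
                    sum += c.get(); // aggregate all values for this key
                }
                context.write(key, new IntWritable(sum));
            }
        }
    }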

A MapReduce job is divided into Map and Reduce tasks, which execute the user-defined Map and Reduce functions. The execution of these tasks can be grouped into phases, representing the Map and Reduce phases, and Reduce tasks can be further divided into other phases, namely the Shuffle and Sort phases. A job is submitted by a user to the master node, which selects worker nodes with idle slots and assigns Map or Reduce tasks to them.

The execution of a Map task can be divided into two phases. In the first, the Map phase reads the task's split from the distributed file system, parses it into records, and applies the user-defined Map function to each record. In the second, after the user-defined Map function has been applied to each input record, the commit phase registers the final output with the TaskTracker, which then informs the JobTracker that the task has finished executing. The output of the Map phase is consumed by the Reduce phase.

The execution of a Reduce task can be divided into three phases. The first phase, called the Shuffle phase, fetches the Reduce task's input data, where each Reduce task is assigned a partition of the key space produced by the Map phase. The second phase, called the Sort phase, groups records with the same key. The third phase, called the Reduce phase, applies the user-defined Reduce function to each key and its values (Kavulya et al., 2010).

A Reduce task cannot fetch the output of a Map task until that Map task has finished and committed its output to disk. Only after receiving its partition from all Map outputs does the Reduce task start the Sort phase; until then, the Reduce task remains in the Shuffle phase. After the Sort phase, the Reduce task enters the Reduce phase, in which it executes the user-defined Reduce function for each key and its values. Finally, the output of the Reduce function is written to a temporary location on the distributed file system (Condie et al., 2010).

MapReduce worker nodes are configured to concurrently execute up to a defined number of Map and Reduce tasks, according to their number of Map and Reduce slots. Each worker node of a MapReduce cluster is configured with a fixed number of Map slots and another fixed number of Reduce slots, which bound the number of Map or Reduce tasks that can be executed concurrently per node. During job execution, if all available slots are occupied, pending tasks must wait until some slots are freed. If the number of tasks in the job is bigger than the number of available slots, Map or Reduce tasks are first scheduled on all available slots; these tasks compose the first wave of tasks, which is followed by subsequent waves. If an input is broken into 200 blocks and there are 20 Map slots in a cluster, the job has 200 Map tasks, and they are executed in 10 waves (Lee et al., 2012). The number and sizes of waves can aid the configuration of tasks for improved cluster utilization (Kavulya et al., 2010); the wave arithmetic is sketched after the next paragraphs.
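The wave arithmetic in the example above reduces to a ceiling division, sketched below with illustrative names; the 64 MB block size is just a typical HDFS default used to make the numbers concrete.

    // With one Map task per input block, the Map phase runs in
    // ceil(mapTasks / clusterMapSlots) waves.
    public final class Waves {

        static int mapWaves(long inputBytes, long blockBytes, int nodes, int mapSlotsPerNode) {
            long mapTasks = (inputBytes + blockBytes - 1) / blockBytes; // one task per block
            int clusterSlots = nodes * mapSlotsPerNode;
            return (int) ((mapTasks + clusterSlots - 1) / clusterSlots); // ceiling division
        }

        public static void main(String[] args) {
            // The example from the text: 200 blocks, 20 Map slots -> 10 waves.
            System.out.println(mapWaves(200L * 64 << 20, 64 << 20, 10, 2)); // prints 10
        }
    }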

The Shuffle phase of the first Reduce wave may be significantly different from the Shuffle phases of the subsequent Reduce waves. This happens because the Shuffle phase of the first Reduce wave overlaps with the entire Map phase, and hence it depends on the number of Map waves and their durations (Verma et al., 2012b).

Each Map task is independent of the other Map tasks, meaning that all Mappers can run in parallel on multiple machines. The number of concurrent Map tasks in a MapReduce system is limited by the number of slots and by the number of blocks into which the input data was divided. Reduce tasks can also be performed in parallel during the Reduce phase, and the number of Reduce tasks in a job is specified by the application and bounded by the number of Reduce slots per node.

MapReduce tries to achieve data locality in its job executions, which means that a Map task and the input data block it will process should be located as close to each other as possible, so that the Map task can read its input data block incurring as little network traffic as possible.

Hadoop1 is an open source implementation of MapReduce, which relies on HDFS for distributed data storage and replication. HDFS is an implementation of the Google File System (Ghemawat et al., 2003), which was designed to store large files, and it was adopted by the MapReduce system as the distributed file system to store its files and intermediate data.

The input data type and workload characteristics impact MapReduce performance, because each application has a different bottleneck resource and requires specific configuration to achieve optimal resource utilization (Kambatla et al., 2009). Hadoop has a set of configuration parameters, whose default values are based on the typical configuration of machines in clusters and on the requirements of a typical application, which usually processes text-like inputs, even though optimal resource utilization in MapReduce depends on the resource consumption profile of each application.

1 http://hadoop.apache.org/
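As a concrete example of the parameters just mentioned, the snippet below sets a few of the stock Hadoop 1.x properties programmatically. The property names are the standard ones from the hdfs-site.xml and mapred-site.xml of that generation; the values are arbitrary illustrations, not tuning advice from this dissertation.

    // Illustrative Hadoop 1.x settings of the kind discussed above.
    import org.apache.hadoop.conf.Configuration;

    public class TuningExample {

        public static Configuration tunedConf() {
            Configuration conf = new Configuration();
            conf.setLong("dfs.block.size", 128L << 20);                  // 128 MB HDFS blocks
            conf.setInt("mapred.tasktracker.map.tasks.maximum", 2);      // Map slots per node
            conf.setInt("mapred.tasktracker.reduce.tasks.maximum", 1);   // Reduce slots per node
            conf.setInt("mapred.reduce.tasks", 8);                       // Reduce tasks per job
            return conf;
        }
    }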


Because the input data type and workload characteristics of MapReduce jobs impact MapReduce performance, it is necessary to evaluate the MapReduce behavior and performance for different purposes. Although much work has been done to understand and analyse MapReduce for different input data types and workloads (Lu et al., 2012; Groot, 2012), there is no evaluation that characterizes the MapReduce behavior and identifies its optimal configuration for packet level analysis and DPI.

2.2 Related Work

2.2.1 Distributed Debugging

Modern Internet services are often implemented as complex, large-scale distributed systems. Information about the behavior of complex distributed systems is necessary to evaluate and improve their performance, but understanding distributed system behavior requires observing related activities across many different components and machines (Sigelman et al., 2010).

The evaluation of distributed applications is a challenge, due to the cost of monitoring distributed systems and the lack of performance measurement of large scale distributed applications at runtime. To reproduce the behavior of a complex distributed system in a test environment, it is necessary to reproduce each relevant configuration parameter of the system (Gupta et al., 2011), which is a difficult effort, even more evident and complex in cases where faults only occur when the system is under high load (Loiseau et al., 2009).

Gupta et al. (2011) presented a methodology and framework for large scale tests, able to reproduce resource configurations and scale close to those of a large scale system, through the use of an emulated scalable network, multiplexed virtual machines, and resource dilation. Gupta et al. (2011) showed its accuracy, scalability, and realism in network tests. However, it cannot achieve the same accuracy as the evaluation of a real system at runtime, nor can it diagnose, in a short time, a problem that occurred in a production environment.

According to Sambasivan et al. (2011), debugging tools are needed to help the identification and understanding of the root causes of the diverse performance problems that can arise in distributed systems. A request flow can be seen as the path and timing of a request in a distributed system, representing the flow of individual requests within and across the components of a distributed system. There are many cases for which comparing request-flow traces is useful; it can help to diagnose performance changes resulting from modifications made during software development or from upgrades of a deployed system. It can also help to diagnose behaviour changes resulting from component degradations, resource leakage, or workload changes.

Sigelman et al. (2010) reported on Dapper, Google's production distributed system tracing framework, which states three concrete design goals: low overhead, application-level transparency, and scalability. These goals were achieved by restricting Dapper's core tracing instrumentation to Google's ubiquitous threading, control flow, and RPC library code. Dapper provides valuable insights about the evaluation of distributed systems through flows and procedure calls, but its implementation depends on instrumenting the component responsible for message communication in the distributed system, which may not be available in a black-box system.

Some techniques have been developed for the performance evaluation of distributed systems. Mi et al. (2012) proposed an approach, based on end-to-end request trace logs, to identify the primary causes of performance problems in cloud computing systems. Nagaraj et al. (2012) compared logs of distributed systems to diagnose performance problems, using machine learning techniques to analyse logs and to explore information about states and event times. Sambasivan et al. (2011) used request flows to find performance modifications in distributed systems, comparing request flows across periods and ranking them based on their impact on the system's performance. Although these approaches evaluate requests, flows, and events of distributed systems, traffic analysis was not used as an approach to provide the desired information.

Aguilera et al. (2003) proposed an approach to isolate performance bottlenecks in distributed systems, based on message-level trace activity and algorithms for inferring the dominant paths of a distributed system. Although network traffic was considered as a source from which to extract the desired information, a distributed approach was not adopted for data processing.

Yu et al. (2011) presented SNAP, a scalable network-application profiler to evaluate the interactions between applications and the network. SNAP passively collects TCP statistics and socket logs, and correlates them with network resources to indicate problem locations. However, SNAP adopted neither application traffic evaluation nor distributed computing to perform network traffic processing.

2.2.2 MapReduce for Network Traffic Analysis

Lee et al. (2010) proposed a network flow analysis method using MapReduce, where the network traffic was captured, converted to text, and used as input to Map tasks. The results showed improvements in fault tolerance and computation time when compared with flow-tools2. However, the conversion time from binary network traces to text represents a relevant additional cost, which can be avoided by adopting binary data as the input of MapReduce jobs.

Lee et al. (2011) presented a Hadoop-based packet trace processing tool to process large amounts of binary network traffic. A new input type for Hadoop was developed, the PcapInputFormat, which encapsulates the complexity of processing captured binary PCAP traces and of extracting the packets through the Libpcap library (Jacobson et al., 1994). Lee et al. (2011) compared their approach with CoralReef3, a network traffic analysis tool that also relies on Libpcap; the results of the evaluation showed a completion time speed-up for a case that processes packet traces of more than 100 GB. This approach implemented a packet level evaluation, to extract indicators from IP, TCP, and UDP, evaluating the job completion time achieved with different input sizes and two cluster configurations. The authors implemented their own component to save network traces into blocks, and the developed PcapInputFormat relies on a timestamp-based heuristic with a sliding window for finding the first packet of each block. These implementations for iterating over the packets of a network trace can present an accuracy limitation compared with the accuracy obtained by Tcpdump4 and Libpcap for the same functionality.
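To illustrate the kind of timestamp heuristic described above (a hedged reconstruction of ours, not Lee et al.'s code), the sketch below slides over a block and accepts an offset as a packet boundary when the candidate 16-byte PCAP record header carries a timestamp within the known capture window and a sane capture length.

    // Slides byte by byte over a block and tests whether the bytes at each
    // offset look like a valid PCAP per-packet header (ts_sec, ts_usec,
    // incl_len, orig_len). Assumes a little-endian trace; a robust reader
    // would check the global header magic first.
    import java.nio.ByteBuffer;
    import java.nio.ByteOrder;

    public final class PcapBoundaryHeuristic {

        static final int MAX_PACKET = 65535;

        // minTs/maxTs: epoch-second bounds of the capture, known from the trace.
        static int findFirstPacket(byte[] block, long minTs, long maxTs) {
            ByteBuffer buf = ByteBuffer.wrap(block).order(ByteOrder.LITTLE_ENDIAN);
            for (int off = 0; off + 16 <= block.length; off++) {
                long ts = buf.getInt(off) & 0xFFFFFFFFL;          // ts_sec
                long capLen = buf.getInt(off + 8) & 0xFFFFFFFFL;  // incl_len
                long origLen = buf.getInt(off + 12) & 0xFFFFFFFFL;
                if (ts >= minTs && ts <= maxTs
                        && capLen > 0 && capLen <= MAX_PACKET
                        && capLen <= origLen) {
                    return off; // plausible per-packet header found
                }
            }
            return -1; // no boundary found in this block
        }
    }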

The approach proposed by Lee et al. (2011) is not able to evaluate more than one packet per MapReduce iteration, because each block is divided into packets that are evaluated individually by the user-defined Map function. Therefore, a new MapReduce approach is necessary to perform DPI algorithms, which require reassembling more than one packet to mount an application message, in order to evaluate message contents, application states and application protocols.

2.3 Chapter Summary

In this chapter, we presented the background information on network traffic analysis, JXTA and MapReduce; we also investigated previous studies related to the measurement of distributed applications and to the use of MapReduce for network traffic analysis.

According to the background and related work evaluated, the detection of error causes, and the diagnosis and reproduction of errors of distributed systems, are challenges that motivate efforts to develop less intrusive mechanisms for monitoring and debugging distributed applications at runtime. Network traffic analysis is one option for distributed systems measurement, although there are limitations on the capacity to process large amounts of network traffic in a short time, and on the scalability to process network traffic where there is variation of resource demand.

Although MapReduce can be used for packet-level analysis, an approach is necessary to use MapReduce for DPI, in order to evaluate distributed systems at a data center through network traffic analysis, using commodity hardware and cloud computing services, in a minimally intrusive way. Due to the lack of evaluation of MapReduce for traffic analysis and the peculiarity of this kind of data, it is necessary to evaluate the performance of MapReduce for packet-level analysis and DPI, characterizing the behaviour followed by the MapReduce phases, its processing capacity scalability and speed-up, over variations of the most important configuration parameters of MapReduce.


3 Profiling Distributed Applications Through Deep Packet Inspection

Life is really simple, but we insist on making it complicated.

—CONFUCIUS

In this chapter, we first look at the problems in distributed application monitoring, in the processing capacity of network traffic analysis, and in the restrictions on using MapReduce for profiling the network traffic of distributed applications.

Network traffic analysis can be used to extract performance indicators from the communication protocols, flows, throughput and load distribution of a distributed system. In this context, network traffic analysis can enrich diagnoses and provide a mechanism for measuring distributed systems in a passive way, with low overhead and low dependency on developers.

However, there are limitations on the capacity to process large amounts of network traffic in a short time, and on the processing capacity scalability needed to process network traffic over variations of throughput and resource demands. To address this problem, we present an approach for profiling application network traffic using MapReduce. Experiments show the effectiveness of our approach for profiling a JXTA-based distributed application through DPI, and its completion time scalability through node addition, in a cloud computing environment.

In Section 3.1 we begin this chapter by motivating the need for an approach using MapReduce for DPI; then we describe, in Section 3.2, the proposed architecture and the DPI algorithm to extract indicators from the network traffic of a JXTA-based distributed application. Section 3.3 presents the adopted evaluation methodology and the experiment setup used to evaluate our proposed approach. The obtained results are presented in Section 3.4 and discussed in Section 3.5. Finally, Section 3.6 concludes and summarizes this chapter.

3.1 Motivation

Modern Internet services and cloud computing infrastructure are commonly implemented as distributed systems, to provide services with high performance, scalability and reliability. Cloud computing SLAs require a short time to identify, diagnose and solve problems in the infrastructure, in order to avoid negative impacts on the provided quality of service.

Monitoring and performance analysis of distributed systems became more necessary with the growth of cloud computing and the use of distributed systems to provide services and infrastructure (Fox et al., 2009). In distributed systems development, maintenance and administration, the detection of error causes, and the diagnosis and reproduction of errors, are challenges that motivate efforts to develop less intrusive mechanisms for debugging and monitoring distributed applications at runtime (Armbrust et al., 2010). Distributed measurement systems (Massie et al., 2004) and log analyzers (Oliner et al., 2012) provide relevant information about some aspects of a distributed system. However, this information can be complemented by correlating it with information from network traffic analysis, making these tools more effective and increasing the information sources available to ubiquitously evaluate a distributed system.

Low overhead, transparency and scalability are common requirements for an efficient solution for the measurement of distributed systems. Many approaches have been proposed in this direction, using instrumentation or logging, which cause overhead and a dependency on developers. It is possible to diagnose and evaluate distributed applications' performance by evaluating information from communication protocols, flows, throughput and load distribution (Sambasivan et al., 2011; Mi et al., 2012). This information can be collected through network traffic analysis, enriching a diagnosis, and also providing an approach for the measurement of distributed systems in a passive way, with low overhead and low dependency on developers.

Network traffic analysis is one option to evaluate distributed systems performance (Yu et al., 2011), although there are limitations on the capacity to process a large number of network packets in a short time (Loiseau et al., 2009; Callado et al., 2009) and on the scalability to process network traffic over variations of throughput and resource demands.

To obtain information about the behaviour of distributed systems from network traffic, it is necessary to use DPI and to evaluate information from application states, which requires an additional effort in comparison with traditional DPI approaches, which usually do not evaluate application states.

Although much work has been done to improve DPI performance (Fernandes et al., 2009; Antonello et al., 2012), the evaluation of application states still decreases the processing capacity of DPI for evaluating large amounts of network traffic. With the growth of link speeds, Internet traffic exchange and the use of distributed systems to provide Internet services (Sigelman et al., 2010), new approaches are needed to deal with the analysis of the growing amount of network traffic, and to permit the efficient evaluation of distributed systems through network traffic analysis.

MapReduce (Dean and Ghemawat, 2008) became an important programming model and distribution platform for processing large amounts of data, with diverse use cases in academia and industry (Zaharia et al., 2008; Guo et al., 2012). MapReduce can be used for packet-level analysis: Lee et al. (2011) proposed an approach to process large amounts of network traffic which splits network traces into packets and evaluates each packet individually, extracting indicators from IP, TCP and UDP at the network and transport layers.

However, for profiling distributed applications through network traffic analysis, it is necessary to analyse the content of more than one packet, up to the application layer, to evaluate application messages and their protocols. Due to TCP and message segmentation, the desired application message may be split into several packets. Therefore, it is necessary to evaluate more than one packet per MapReduce iteration to perform deep packet inspection, in order to be able to reassemble more than one packet and mount application messages, so as to retrieve information from the application sessions, states and protocols.

DPI refers to examining both the packet header and the complete payload to look for predefined patterns or rules, which can be a signature string or an application message. According to the taxonomy presented by Risso et al. (2008), deep packet inspection can be classified as message based per flow state (MBFS), which analyses application messages and their flows, and as message based per protocol state (MBPS), which analyses application messages and their application protocol states; these are the kinds of analysis needed to evaluate distributed applications through network traffic analysis and to extract application indicators.


MapReduce is a restricted programming model for parallelizing user functions automatically and providing transparent fault tolerance (Dean and Ghemawat, 2008), based on combinators from functional languages. MapReduce does not efficiently express incremental, dependent or recursive data processing (Bhatotia et al., 2011; Lin, 2012), because it adopts batch processing with functions executed independently, without shared state.

Although restrictive, MapReduce is a good fit for many problems involving the processing of large datasets. Also, its expressiveness limitations may be reduced by decomposing a problem into multiple MapReduce iterations, or by combining MapReduce with other programming models for subproblems (Lämmel, 2007; Lin, 2012), although this approach may not be optimal in some cases. DPI algorithms require the evaluation of one or more packets to retrieve information from application messages; this represents a data dependence in mounting an application message and is a restriction on the use of MapReduce for DPI.

Because the approach of Lee et al. (2011) processes each packet individually, it cannot be efficiently used to evaluate more than one packet and reassemble an application message from a network trace, which makes necessary a new approach for using MapReduce to perform DPI and to evaluate application messages.

To be able to process large amounts of network traffic using commodity hardware, in order to evaluate the behaviour of distributed systems at runtime, and also because there was no evaluation of MapReduce effectiveness and processing capacity for DPI, we developed an approach based on MapReduce to deeply inspect distributed application traffic, in order to evaluate the behaviour of distributed systems, using Hadoop, an open source implementation of MapReduce.

In this chapter we evaluate the effectiveness of MapReduce for a DPI algorithm and its completion time scalability through node addition, to measure a JXTA-based application, using virtual machines of Amazon EC2 (http://aws.amazon.com/ec2/), a cloud computing provider. The main contributions of this chapter are:

1. To provide an approach to implement DPI algorithms using MapReduce;

2. To show the effectiveness of MapReduce for DPI;

3. To show the completion time scalability of MapReduce for DPI, using virtual machines of cloud computing providers.


3.2 Architecture

In this section we present the architecture of the proposed approach to capture and process network traffic of distributed applications.

To monitor distributed applications through network traffic analysis, specific points of a data center must be monitored to capture the desired application network traffic. Also, an approach is needed to process a large amount of network traffic in an acceptable time. According to Sigelman et al. (2010), fresh information enables a faster reaction to production problems, so the information must be obtained as soon as possible, although a trace analysis system operating on hours-old data is still valuable for monitoring distributed applications in a data center.

In this direction, we propose a pipelined process to capture network traffic, store it locally, transfer it to a distributed file system, and evaluate the network trace to extract application indicators. We use MapReduce, implemented by Apache Hadoop, to process application network traffic, extract application indicators, and provide an efficient and scalable solution for DPI and for profiling application network traffic in a production environment, using commodity hardware.

The architecture for network traffic capturing and processing is composed of four main components: the SnifferServer (shown in Figure 3.1), which captures, splits and stores network packets into the HDFS for batch processing through Hadoop; the Manager, which orchestrates the collected data and the job executions, and stores the generated results; the AppParser, which converts network packets into application messages; and the AppAnalyzer, which implements the Map and Reduce functions to extract the desired indicators.

Figure 3.1 Architecture of the SnifferServer to capture and store network traffic


Figure 3.1 shows the architecture of the SnifferServer and its placement at the monitoring points of a data center. The SnifferServer captures network traffic from specific points and stores it into the HDFS, for batch processing through Hadoop. The Sniffer executes user-defined monitoring plans, guided by specifications of places, time, traffic filters and the amount of data to be captured. According to a user-defined monitoring plan, the Sniffer starts the capture of the desired network traffic through Tcpdump, which saves the network traffic in binary files, known as PCAP files. The collected traffic is split into files of a predefined size, saved in the local SnifferServer file system, and transferred to HDFS only when each file is completely saved in the local file system of the SnifferServer. The SnifferServer must be connected to the network where the monitored target nodes are connected, and must be able to establish communication with the other nodes that compose the HDFS cluster.

During the execution of a monitoring plan, initially the network traffic must be captured, split into even-sized files and stored into HDFS. Through Tcpdump, a widely used LibPCAP-based network traffic capture tool, the packets are captured and split into PCAP files of 64MB, which is the default block size of the HDFS, although this block size may be configured to different values.

HDFS is optimized to store large files, but internally each file is split into blocks of a predefined size. Files that are greater than the HDFS block size are split into blocks with size equal to or smaller than the adopted block size, and are spread among the machines of the cluster.

Because LibPCAP, used by Tcpdump, stores the network packets in binary PCAP files, and due to the complexity of providing HDFS with an algorithm for splitting PCAP files into packets, PCAP file splitting can be avoided by adopting files smaller than the HDFS block size; alternatively, an algorithm to split PCAP files into packets can be provided to Hadoop, in order to store arbitrary PCAP files into the HDFS.

We adopted the approach that saves the network trace into PCAP files of the adopted HDFS block size, using the split functionality provided by Tcpdump, because splitting PCAP files into packets demands additional computing time and increases the complexity of the system. Thus, the network traffic is captured by Tcpdump, split into even-sized PCAP files, stored into the local file system of the SnifferServer, and periodically transferred to HDFS, which is responsible for replicating the files across the cluster.
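A minimal Java sketch of this capture-and-transfer pipeline is given below, as an illustration only: the network interface, the capture filter (JXTA's default TCP port 9701), the directory paths and the 30-second quiescence heuristic are our assumptions, not the actual SnifferServer implementation. Tcpdump's -C option rotates the capture into files of approximately the given size in millions of bytes, and Hadoop's FileSystem API moves each closed file into HDFS.

// Sketch of the capture pipeline: rotate the Tcpdump capture into ~64MB
// PCAP files and transfer each closed file to HDFS. Illustrative only.
import java.io.File;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SnifferServer {
    public static void main(String[] args) throws Exception {
        // -C takes the rotation size in millions of bytes, so 64 yields
        // files close to the default HDFS block size of 64MB.
        Process tcpdump = new ProcessBuilder(
                "tcpdump", "-i", "eth0", "-w", "/var/capture/trace.pcap",
                "-C", "64", "tcp", "port", "9701")
                .start();

        FileSystem hdfs = FileSystem.get(new Configuration());
        File captureDir = new File("/var/capture");
        while (tcpdump.isAlive()) {
            Thread.sleep(10_000);                      // poll the capture dir
            for (File f : captureDir.listFiles()) {
                // Heuristic: a rotated file is considered closed once
                // Tcpdump has not written to it for 30 seconds.
                if (System.currentTimeMillis() - f.lastModified() > 30_000) {
                    hdfs.copyFromLocalFile(true /* delete local copy */,
                            new Path(f.getAbsolutePath()),
                            new Path("/traces/" + f.getName()));
                }
            }
        }
    }
}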

In the MapReduce framework, the input data is split into blocks, which are split into small pieces, called records, to be used as input for each Map function. We adopt the use of entire blocks, with size defined by the HDFS block size, as input for each Map function, instead of using the block divided into records. With this approach, it is possible to evaluate more than one packet per MapReduce task and to mount an application message from the network traffic. It is also possible to obtain more processing time per Map function than with the approach where each Map function receives only one packet as input.

Differently from the approach presented by Lee et al. (2011), which only permits the evaluation of one packet at a time per Map function, with our approach it is possible to evaluate many packets from a PCAP file per Map function and to reassemble application messages from network traffic whose content was divided into many packets to be transferred over TCP.
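In Hadoop, this whole-block policy can be expressed with an input format that refuses to split its files and hands each Map task the file location as its single record. The sketch below shows one hypothetical way to write it; only the Hadoop API types are real, the class itself is ours.

// Hypothetical input format delivering one whole PCAP file per Map task.
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

public class PcapFileInputFormat extends FileInputFormat<NullWritable, Text> {

    @Override
    protected boolean isSplitable(JobContext context, Path file) {
        return false;   // never divide a PCAP file into records
    }

    @Override
    public RecordReader<NullWritable, Text> createRecordReader(
            InputSplit split, TaskAttemptContext context) {
        return new RecordReader<NullWritable, Text>() {
            private Text path;
            private boolean consumed = false;

            public void initialize(InputSplit s, TaskAttemptContext c) {
                // The whole file is one logical record: its HDFS path.
                path = new Text(((FileSplit) s).getPath().toString());
            }
            public boolean nextKeyValue() {
                if (consumed) return false;
                consumed = true;
                return true;
            }
            public NullWritable getCurrentKey() { return NullWritable.get(); }
            public Text getCurrentValue() { return path; }
            public float getProgress() { return consumed ? 1.0f : 0.0f; }
            public void close() { }
        };
    }
}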

Figure 3.2 shows the architecture to process distributed application traffic through the Map and Reduce functions implemented by the AppAnalyzer, which is deployed at the Hadoop nodes and managed by the Manager, with the generated results stored into a distributed database.

Figure 3.2 Architecture for network traffic analysis using MapReduce

The communication between components was characterized as blocking and non-blocking; blocking communication was adopted in cases that require high consistency, and non-blocking communication was adopted in cases where it is possible to use eventual consistency to obtain better response time and scalability.


The AppAnalyzer is composed of Mappers and Reducers for specific application protocols and indicators. The AppAnalyzer extends the AppParser, which provides protocol parsers to transform network traffic into programmable objects, providing a high-level abstraction to handle application messages from network traffic.

The Manager provides functionalities for users to create monitoring plans with specifications of places, time and the amount of data to be captured. The amount of data to be processed and the number of Hadoop nodes available for processing are important factors in obtaining an optimal completion time of MapReduce jobs and in generating fresh information for faster reaction to production problems of the monitored distributed system. Thus, after the network traffic is captured and the PCAP files are stored into HDFS, the Manager permits the selection of the number of files to be processed, and then schedules a MapReduce job for this processing. After each MapReduce job execution, the Manager is also responsible for storing the generated results into a distributed database.

We adopted a distributed database with eventual consistency and high availability, based on Amazon's Dynamo (DeCandia et al., 2007) and implemented by Apache Cassandra (http://cassandra.apache.org/), to store the indicator results generated by the AppAnalyzer. With eventual consistency, we expect gains from fast write and read operations, in order to reduce the blocking time of these operations.

The AppAnalyzer provides Map and Reduce functions to be used for evaluating specific protocols and desired indicators. Each Map function receives as input the path of a PCAP file stored into HDFS; this path is defined by Hadoop's data locality control, which tries to delegate each task to nodes that have a local replica of the data or that are near a replica. Then the file is opened and each network packet is processed, to remount messages and flows, and to extract the desired indicators.

During the data processing, the indicators are extracted from the application messages and saved in a SortedMapWritable object, which is ordered by timestamp. The SortedMapWritable is a sorted collection of values which will be used by the Reduce functions to summarize each evaluated indicator. In our approach, each evaluated indicator is extracted and saved into an individual Hadoop result file, which is stored into HDFS.
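A condensed sketch of such a Map function follows, assuming the whole-file input format sketched above; PcapTrace, JxtaParser and JxtaMessage are hypothetical stand-ins for the AppParser functionality, not its actual API.

// Sketch of an AppAnalyzer-style Map function emitting, per indicator,
// a SortedMapWritable of timestamped samples. Parser types are stand-ins.
import java.io.IOException;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.SortedMapWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class IndicatorMapper
        extends Mapper<NullWritable, Text, Text, SortedMapWritable> {

    @Override
    protected void map(NullWritable key, Text pcapPath, Context context)
            throws IOException, InterruptedException {
        // One sorted collection per indicator, ordered by timestamp.
        SortedMapWritable rtt = new SortedMapWritable();

        PcapTrace trace = PcapTrace.open(pcapPath.toString()); // stand-in
        for (JxtaMessage msg : new JxtaParser(trace)) {         // stand-in
            // timestamp (ms) -> round-trip time measured for this message
            rtt.put(new LongWritable(msg.timestampMillis()),
                    new DoubleWritable(msg.roundTripTime()));
        }
        context.write(new Text("rtt"), rtt);
    }
}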

MapReduce usually splits blocks into records to be used as input for the Map functions, but we adopt whole files as input for the Map tasks, to be able to perform DPI and to reassemble application messages whose content was divided into several TCP packets, due to TCP segmentation or to an implementation decision of the evaluated application. If an application message is smaller than the maximum segment size (MSS), one TCP packet can transport one or more application messages, but if an application message is greater than the MSS, the message is split into several TCP packets, according to the TCP segmentation. Thus, it is necessary to evaluate the full content of some TCP segments to recognize application messages and their protocols.

If an application message has its packets spread across two or more blocks, it is possible for the Map function to generate intermediate data for these unevaluated messages, grouping each message by its flow and individual identification, and to use the Reduce function to reassemble and evaluate the message.
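One hedged way to realize this cross-block case is sketched below: the Map side would emit the leftover segments keyed by a flow identifier, and the Reduce side concatenates them in sequence order before handing the byte stream to the message parser. The key layout and the emitted value format are illustrative, not the implementation described here.

// Illustrative reduce-side reassembly of messages spanning HDFS blocks.
// Values carry (sequence number -> payload bytes) maps for one TCP flow.
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.TreeMap;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.SortedMapWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class ReassemblyReducer
        extends Reducer<Text, SortedMapWritable, Text, BytesWritable> {

    @Override
    protected void reduce(Text flowKey, Iterable<SortedMapWritable> fragments,
                          Context context) throws IOException, InterruptedException {
        // Order every fragment of this flow by TCP sequence number.
        TreeMap<Long, byte[]> bySeq = new TreeMap<>();
        for (SortedMapWritable frag : fragments) {
            for (Object o : frag.entrySet()) {
                java.util.Map.Entry<?, ?> e = (java.util.Map.Entry<?, ?>) o;
                BytesWritable bytes = (BytesWritable) e.getValue();
                // Copy out of Hadoop's reused buffers before iterating on.
                bySeq.put(((LongWritable) e.getKey()).get(),
                        Arrays.copyOf(bytes.getBytes(), bytes.getLength()));
            }
        }
        ByteArrayOutputStream stream = new ByteArrayOutputStream();
        for (byte[] part : bySeq.values()) stream.write(part);
        // A real implementation would hand these bytes to the message
        // parser here, instead of emitting them raw.
        context.write(flowKey, new BytesWritable(stream.toByteArray()));
    }
}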

To evaluate the effectiveness of our approach, we developed a pilot project to extract application indicators from the traffic of a JXTA-based distributed application; this application implements a distributed backup system, based on JXTA Sockets. To analyse the JXTA-based network traffic, we developed JNetPCAP-JXTA (Vieira, 2012b), which parses network traffic into Java JXTA messages, and the JXTAPerfMapper and JXTAPerfReducer, which extract application indicators from the JXTA Socket communication layer through Map and Reduce functions.

JNetPCAP-JXTA is written in the Java language and provides methods to convert byte arrays into Java JXTA messages, using an extension of the default JXTA library for Java, known as JXSE (http://jxse.kenai.com/). With JNetPCAP-JXTA, we are able to parse all kinds of messages defined by the JXTA specification. JNetPCAP-JXTA relies on the JNetPCAP library to support the instantiation and inspection of LibPCAP packets. JNetPCAP was adopted due to its performance when iterating over packets, the large set of functionalities provided to handle packet traces, and the recent update activity of this library.

The JXTAPerfMapper implements a Map function that receives as input the path of a PCAP file stored into the HDFS; the content of the specified file is then processed to extract the number of JXTA connection requests and the number of JXTA message arrivals at a server peer, and to evaluate the round-trip time of each piece of content transmitted over a JXTA Socket. If a JXTA message is greater than the TCP PDU size, the message is split into several TCP segments, due to TCP segmentation. Additionally, in JXTA network traffic, one TCP packet can transport more than one JXTA message, due to the buffer window size used by the Java JXTA Socket implementation to segment its messages.

Because of the possibility of transporting more than one JXTA message per packet, and because of TCP segmentation, it is necessary to reassemble more than one packet and the full content of each TCP segment to recognize all possible JXTA messages, instead of evaluating only a message header or the signature of individual packets, as is commonly done in DPI or by widely used traffic analysis tools such as Wireshark (http://www.wireshark.org/), which is unable to recognize all JXTA messages in a captured network traffic, because its approach does not identify when two or more JXTA messages are transported in the same TCP packet.

The JXTAPerfMapper implements a DPI algorithm to recognize, sort and reassemble TCP segments into JXTA messages, which is shown in Algorithm 1.

Algorithm 1 JXTAPerfMapper
for all tcpPacket do
    if isJXTA or isWaitingForPendings then
        parsePacket(tcpPacket)
    end if
end for

function PARSEPACKET(tcpPacket)
    parseMessage
    if isMessageParsed then
        updateSavedFlows
        if hasRemain then
            parsePacket(remainPacket)
        end if
    else
        savePendingMessage
        lookForMoreMessages
    end if
end function

For each TCP packet of the PCAP file, it is verified whether it is a JXTA message or part of a JXTA message that was not fully parsed and is waiting for its complement; if one of these conditions is true, then a parse attempt is made, using the JNetPCAP-JXTA functionalities, up to the full verification of the packet content. As a TCP packet may contain more than one JXTA message, if a message is fully parsed, then another parse attempt is made on the content not used by the previous parse. If the content is a JXTA message and the parse attempt is not successful, then its TCP content is stored with its TCP flow identification as a key, and all the following TCP packets that match the flow identification will be sorted and used in attempts to mount a new JXTA message, until the parser succeeds.
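The sketch below is one hypothetical Java rendering of this loop; JxtaParseResult and its tryParse entry point stand in for the JNetPCAP-JXTA functionality and are not its actual API.

// Hypothetical rendering of Algorithm 1: buffer unparsed bytes per TCP
// flow until enough segments arrive to mount a complete JXTA message.
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class JxtaStreamAssembler {
    // Pending (not yet parseable) bytes, keyed by TCP flow identification.
    private final Map<String, byte[]> pending = new HashMap<>();

    public void onPacket(String flowId, byte[] payload) {
        byte[] data = payload;
        byte[] head = pending.remove(flowId);
        if (head != null) {
            data = concat(head, payload);      // complement a pending message
        }
        parse(flowId, data);
    }

    private void parse(String flowId, byte[] data) {
        JxtaParseResult r = JxtaParseResult.tryParse(data);
        if (r.complete()) {
            emit(r.message());                 // a fully mounted JXTA message
            if (r.remaining().length > 0) {
                parse(flowId, r.remaining());  // packet may hold more messages
            }
        } else {
            pending.put(flowId, data);         // wait for the next segments
        }
    }

    private void emit(Object message) { /* hand to indicator extraction */ }

    private static byte[] concat(byte[] a, byte[] b) {
        byte[] out = Arrays.copyOf(a, a.length + b.length);
        System.arraycopy(b, 0, out, a.length, b.length);
        return out;
    }
}

// Stand-in for the JNetPCAP-JXTA parsing entry point.
interface JxtaParseResult {
    boolean complete();
    Object message();
    byte[] remaining();
    static JxtaParseResult tryParse(byte[] data) {
        throw new UnsupportedOperationException("parser stand-in");
    }
}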

With these characteristics, inspecting JXTA messages and extracting application indicators requires more effort than other cases of DPI. For this kind of traffic analysis, the memory requirements become even larger, because it is necessary to take into account not only the state of the transport session, but also the state of each application-layer session. The processing power required is also the highest, because protocol conformance analysis requires processing the entire application data (Risso et al., 2008).

As previously shown in Figure 3.2, the AppAnalyzer is composed of Map and Reduce functions, respectively the JXTAPerfMapper and the JXTAPerfReducer, to extract performance indicators from the JXTA Socket communication layer, which is a JXTA communication mechanism that implements reliable message exchange and obtains the best throughput among the communication layers provided by the Java JXTA implementation.

The JXTA Socket messages are transported by the TCP protocol, but JXTA also implements its own control for data delivery, retransmissions and acknowledgements. Each message of a JXTA Socket is part of a Pipe that represents a connection established between the sender and the receiver. In a JXTA Socket communication, two Pipes are established, one from sender to receiver and the other from receiver to sender, through which content messages and acknowledgement messages are transported, respectively. To evaluate and extract performance indicators from a JXTA Socket, the messages must be sorted, grouped and linked with their respective Pipes of content and acknowledgement.

The content transmitted through a JXTA Socket is split into byte array blocks and stored in reliability messages, each of which is sent to the destination, which is expected to send back an acknowledgement message upon its arrival. The time between the message delivery and the sending of the acknowledgement is called the round-trip time (RTT); it may vary according to the system load and may indicate a possible overload of a peer. In the Java JXTA implementation, each block received or to be sent is queued by the JXTA implementation until the system is ready to process a new block. This waiting time to handle messages can impact the response time of the system, increasing the message RTT.

The JXTAPerfMapper and JXTAPerfReducer evaluate the RTT of each content block transmitted over a JXTA Socket, and also extract information about the number of connection requests and message arrivals per time. Each Map function evaluates the packet trace to mount JXTA messages, Pipes and Sockets. The parsed JXTA messages are sorted by their sequence number and grouped by their Pipe identification, to compose the Pipes of a JXTA Socket. As soon as the messages are sorted and grouped, the RTT is obtained, its value is associated with its key and written as an output of the Map function.
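As an illustration, the RTT extraction reduces to matching each content message with the acknowledgement bearing the same sequence number on the reverse Pipe; the sketch below assumes hypothetical message records (pipe identification, sequence number, timestamp) already parsed from the trace.

// Illustrative RTT computation: pair each content message with the
// acknowledgement of the same sequence number on the reverse Pipe.
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RttExtractor {

    // Minimal parsed-message record assumed for this sketch.
    public static final class Msg {
        final String pipeId;
        final long seqNumber;
        final long timestampMillis;
        public Msg(String pipeId, long seqNumber, long timestampMillis) {
            this.pipeId = pipeId;
            this.seqNumber = seqNumber;
            this.timestampMillis = timestampMillis;
        }
    }

    /** Returns one RTT sample (in ms) per acknowledged content message. */
    public static List<Long> rtts(List<Msg> contentPipe, List<Msg> ackPipe) {
        Map<Long, Long> ackTimeBySeq = new HashMap<>();
        for (Msg ack : ackPipe) {
            ackTimeBySeq.put(ack.seqNumber, ack.timestampMillis);
        }
        List<Long> samples = new ArrayList<>();
        for (Msg content : contentPipe) {
            Long ackTime = ackTimeBySeq.get(content.seqNumber);
            if (ackTime != null) {
                samples.add(ackTime - content.timestampMillis);   // RTT
            }
        }
        return samples;
    }
}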

The Reduce function defined by the JXTAPerfReducer receives as input a key and a collection of values, which are the evaluated indicator and its collected values, respectively, and then generates individual files with the results of each evaluated indicator. The requirements for extending these Map and Reduce functions to address other application indicators, such as throughput or the number of retransmissions, are that each indicator must be represented by an intermediate key, which is used by MapReduce for grouping and sorting, and that the collected values must be associated with this key.
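A minimal reducer skeleton consistent with this description might look as follows; it is a sketch, and the use of Hadoop's MultipleOutputs to name one result file per indicator is our assumption, not necessarily the JXTAPerfReducer implementation.

// Sketch of a reducer writing one result file per indicator key
// (e.g. "rtt", "connections", "arrivals") via MultipleOutputs.
import java.io.IOException;
import org.apache.hadoop.io.SortedMapWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

public class IndicatorReducer
        extends Reducer<Text, SortedMapWritable, Text, SortedMapWritable> {

    private MultipleOutputs<Text, SortedMapWritable> out;

    @Override
    protected void setup(Context context) {
        out = new MultipleOutputs<>(context);
    }

    @Override
    protected void reduce(Text indicator, Iterable<SortedMapWritable> values,
                          Context context) throws IOException, InterruptedException {
        // Merge the timestamped samples of this indicator from all Map
        // tasks; SortedMapWritable keeps them ordered by timestamp.
        SortedMapWritable merged = new SortedMapWritable();
        for (SortedMapWritable v : values) {
            merged.putAll(v);
        }
        // Write to an individual output file named after the indicator.
        out.write(indicator, merged, indicator.toString());
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        out.close();
    }
}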

3.3 Evaluation

In this section we perform an experiment to evaluate the effectiveness of MapReduce in expressing DPI algorithms and its completion time scalability for profiling distributed applications through DPI; our scope was thus limited to evaluating the AppAnalyzer, the AppParser and the Hadoop environment, from the architecture presented before.

3.3.1 Evaluation Methodology

For this experimental evaluation, we adopted a methodology based on aspects of the GQM (Goal-Question-Metric) template (Basili et al., 1994) and on the systematic approach to performance evaluation defined by Jain (1991).

Two questions were defined to achieve our stated goal:

• Q1: Can MapReduce express DPI algorithms and extract application indicators from the network traffic of distributed applications?

• Q2: Is the completion time of MapReduce for DPI proportionally scalable with the addition of worker nodes?

To answer these questions, the metrics described in Table 3.1 were evaluated; they capture the number of indicators extracted from distributed application traffic and the behaviour of the completion time scalability obtained with the variation of the number of worker nodes in a MapReduce cluster. The completion time scalability evaluates how the completion time decreases as nodes are added to a MapReduce cluster, for processing a defined input dataset.

This experimental evaluation adopts the factors and levels described in Table 3.2, which represent the number of worker nodes of the MapReduce cluster and the input size used in the MapReduce jobs. These factors make it possible to evaluate the scalability behaviour of MapReduce over variations of the selected factors.


Table 3.1 Metrics to evaluate MapReduce effectiveness and completion time scalability for DPI of a JXTA-based network traffic

Metric                       | Description                                                                          | Question
M1: Number of Indicators     | Number of application indicators extracted from a distributed application traffic.  | Q1
M2: Proportional Scalability | Verify whether the completion time decreases proportionally to the number of worker nodes. | Q2

Table 3.2 Factors and levels to evaluate the defined metrics

Factor                 | Levels
Number of worker nodes | 3 up to 19
Input size             | 16GB and 34GB

Our testing hypotheses are defined in Tables 3.3 and 3.4, which describe the null hypothesis and the alternative hypothesis for each previously defined question. Table 3.3 describes our hypotheses and Table 3.4 presents the notation used to evaluate them.

Table 3.3 Hypotheses to evaluate the defined metrics

Alternative Hypothesis | Null Hypothesis | Question
H1_num.indct: It is possible to use MapReduce for extracting application indicators from network traffic. | H0_num.indct: It is not possible to use MapReduce for extracting application indicators from network traffic. | Q1
H1_scale.prop: The completion time of MapReduce for DPI does not scale proportionally to node addition. | H0_scale.prop: The completion time of MapReduce for DPI scales proportionally to node addition. | Q2

The hypotheses H1_num.indct and H0_num.indct were defined to evaluate whether MapReduce can be used to extract application indicators from network traffic; for this evaluation, we analysed the number of indicators extracted from a JXTA-based network traffic, represented by µ_num.indct.

It is common to see statements saying that MapReduce scalability is linear, but achieving linear scalability in distributed systems is a difficult task. Linear scalability happens when a parallel system does not lose performance while scaling (Gunther, 2006); in that case a node addition implies a proportional performance gain in completion time or processing capacity. We defined the hypotheses H1_scale.prop and H0_scale.prop to evaluate the completion time scalability behaviour of MapReduce, testing whether it provides proportional completion time scalability. In these hypotheses, t represents the completion time for executing a job j, s represents the cluster size, and n represents the evaluated multiplication factor, i.e. the factor by which the evaluated cluster size is increased.


Table 3.4 Hypothesis notation

Hypothesis    | Notation                              | Question
H1_num.indct  | µ_num.indct > 0                       | Q1
H0_num.indct  | µ_num.indct ≤ 0                       | Q1
H1_scale.prop | ∀n ∈ N*, s_n = s·n ⇒ t_n ≠ t/n        | Q2
H0_scale.prop | ∀n ∈ N*, s_n = s·n ⇒ t_n = t/n        | Q2

H0_scale.prop states that, when evaluating a specific MapReduce job and input data, for every natural n greater than zero, a new cluster size defined as the previous cluster size s multiplied by the factor n implies a reduction of the previous job completion time t by the same factor n, resulting in the time t_n = t/n.

3.3.2 Experiment Setup

To evaluate the MapReduce effectiveness for application traffic analysis and its completion time scalability, we performed two sets of experiments, grouped by the input size analysed, with variation of the number of worker nodes.

As input for the MapReduce jobs, we used network traffic captured from a JXTA-based distributed backup system, which uses the JXTA Socket communication layer for data transfer between peers. The network traffic was captured from an environment composed of six peers, where one server peer receives data from five concurrent client peers, to be stored and replicated to other peers. During the traffic capture, one server peer creates a JXTA Socket Server to accept JXTA Socket connections and receives data through the established connections.

For each data backup, one client peer establishes a connection with a server peer and sends messages with the content to be stored; if the content to be stored is bigger than the JXTA message maximum size, the content is transferred through two or more JXTA messages. For our experiment, we adopted the backup of files with content size randomly defined, with values between 64KB and 256KB.

The captured network traffic was saved into datasets of 16GB and 34GB, split respectively into 35 and 79 files of 64MB, and stored into HDFS, to be processed as described in Section 3.2, in order to extract from the JXTA Socket communication layer the selected indicators: the round-trip time, the number of connection requests per time and the number of messages received by one server peer per time.

For each experiment set, Algorithm 1, implemented by the JXTAPerfMapper and JXTAPerfReducer, was executed, and the completion time and processing capacity for profiling a JXTA-based distributed application through DPI were measured, over different numbers of worker nodes. Each experiment was executed 30 times to obtain reliable values (Chen et al., 2011), within a confidence interval of 95% and a maximum error ratio of 5%. The experiment was performed using virtual machines of Amazon EC2, with nodes running Linux kernel 3.0.0-16, Hadoop version 0.20.203, a block size of 64MB and with the data replicated 3 times over the HDFS. All the virtual machines used were composed of 2 virtual cores, 2.5 EC2 Compute Units and 1.7GB of RAM.

3.4 Results

From the JXTA traffic analysed, we extracted three indicators: the number of JXTA connection requests per time, the number of JXTA messages received per time, and the round-trip time of JXTA messages, which is defined by the time between the arrival of a content message from a client peer and the JXTA acknowledgement sent back by the server peer. The extracted indicators are shown in Figure 3.3.

Figure 3.3 JXTA Socket trace analysis

Figure 3.3 shows the extracted indicators, exhibiting the indicators measured from the JXTA Socket communication layer and its behaviour under concurrent data transfers, for a server peer receiving JXTA Socket connection requests and messages from concurrent client peers of a distributed backup system.

The three indicators extracted from the network traffic of a JXTA-based distributed application, using MapReduce to perform the DPI algorithm, represent important indicators for evaluating a JXTA-based application (Halepovic and Deters, 2005). With these extracted indicators it is possible to evaluate a distributed system, providing a better understanding of the behaviour of a JXTA-based distributed application. Through the extracted information it is possible to evaluate important metrics, such as the load distribution, the response time and the negative impact caused by the increase in the number of messages received by a peer.

Using MapReduce to perform a DPI algorithm, it was possible to extract the three application indicators from the network traffic; therefore we obtained µ_num.indct = 3, which rejects the null hypothesis H0_num.indct, which states µ_num.indct ≤ 0, and confirms the alternative hypothesis H1_num.indct, which states µ_num.indct > 0.

Figures 3.4(a) and 3.4(b) illustrate how the addition of worker nodes to a Hadoop cluster reduces the mean completion time, and the scalability of the completion time for profiling 16 GB and 34 GB of network traffic traces.

Figure 3.4 Completion time scalability of MapReduce for DPI: (a) scalability to process 16 GB; (b) scalability to process 34 GB. Each panel plots the mean completion time (s) against the number of worker nodes.

In both graphs, the behaviour of the completion time scalability is similar, not following a linear function, with more significant scalability gains from node addition in smaller clusters, and less significant gains from node addition in bigger clusters.

This scalability behaviour highlights the importance of evaluating the relation between the costs and benefits of node additions in a MapReduce cluster, due to the non-proportional gain obtained with node addition.

Tables 3.5 and 3.6 present, respectively, the results of the experiments that deeply inspect 16 GB and 34 GB of network traffic traces, showing the number of Hadoop nodes used for each experiment, the mean completion time in seconds, its margin of error, the processing capacity achieved and the relative processing capacity per node in the cluster.

Table 3.5 Completion time to process 16 GB split into 35 files

Nodes           | 3      | 4      | 6      | 8      | 10
Time (s)        | 322.53 | 246.03 | 173.17 | 151.73 | 127.17
Margin of Error | 0.54   | 0.67   | 0.56   | 1.55   | 1.11
MB/s            | 50.80  | 66.59  | 94.61  | 107.98 | 128.84
(MB/s)/node     | 16.93  | 16.65  | 15.77  | 13.50  | 12.88

Table 3.6 Completion time to process 34 GB split into 79 files

Nodes           | 4      | 8      | 12     | 16     | 19
Time (s)        | 464.33 | 260.60 | 189.07 | 167.13 | 134.47
Margin of Error | 0.32   | 0.76   | 1.18   | 0.81   | 1.53
MB/s            | 74.98  | 133.60 | 184.14 | 208.32 | 258.91
(MB/s)/node     | 18.75  | 16.70  | 15.35  | 13.02  | 13.63

In our experiments, we achieved a maximum mean processing capacity of 258.91 MB per second, in a cluster with 19 worker nodes processing 34 GB. For a cluster with 4 nodes we achieved mean processing capacities of 66.59 MB/s and 74.98 MB/s to process 16 GB and 34 GB of network traffic trace, respectively, which indicates that the processing capacity may vary as a function of the amount of data processed and the number of files used as input, and that the input size is an important factor to be analysed in MapReduce performance evaluations.

The results show that, for the evaluated scenario and application, the completion time decreases with the increase of the number of nodes in the cluster, but not proportionally to the node addition and not following a linear function, as can be observed in Figures 3.4(a) and 3.4(b). Tables 3.5 and 3.6 also show values that confirm the non-proportional completion time scalability. For example, Table 3.5 shows that when a cluster with 4 nodes processing 16 GB was scaled out to 8 nodes, an increment of 2 times in the number of nodes, we achieved a gain of only 1.62 times in completion time.

To evaluate our stated hypotheses H1_scale.prop and H0_scale.prop on this example, we take s = 4 and n = 2, so the scaled cluster size is s_2 = s·n = 4·2 = 8, which matches the measured cluster size and confirms s_n = s·n. For the completion time, the measured value is t_2 = 151.73 s, while the value predicted by proportional scalability is t/n = 246.03/2 = 123.01 s; this rejects t_n = t/n and confirms t_n ≠ t/n. Therefore, with the measured results, the null hypothesis H0_scale.prop was rejected and the alternative hypothesis H1_scale.prop was confirmed, which states that the completion time of MapReduce for DPI does not scale proportionally to node addition.

3.5 Discussion

In this section, we discuss the measured results and evaluate their meaning, restrictions and opportunities. We also discuss possible threats to the validity of our experimental results.

3.5.1 Results Discussion

Distributed systems analysis, the detection of root causes and error reproduction are challenges that motivate efforts to develop less intrusive mechanisms for profiling and monitoring distributed applications at runtime. Network traffic analysis is one option for evaluating distributed systems, although there are limitations on the capacity to process a large amount of network traffic in a short time, and on the completion time scalability to process network traffic where there is variation of resource demand.

According to the evaluated results, obtained by using MapReduce for profiling the network traffic of a JXTA-based distributed backup system through DPI, it is important to analyse the possible gains of node addition to a MapReduce cluster, because node addition provides different gains according to the cluster size and input size. For example, Table 3.6 shows that the addition of 4 nodes to a cluster with 12 nodes produces a reduction of 11% in completion time and an improvement of 13% in processing capacity, while the addition of the same number of nodes (4 nodes) to a cluster with 4 nodes produces a reduction of 43% in completion time and an improvement of 78% in processing capacity.

The scalability behaviour of MapReduce for DPI highlights the importance of evaluating the relation between the costs and benefits of node additions to a MapReduce cluster, because the gains obtained with node addition are related to the current and future cluster sizes and to the input size to be processed.

The growth of the number of nodes in the cluster increases costs due to greater cluster management, data replication, task allocation to the available nodes and failure management. Also, as the cluster grows, the cost of merging and sorting the data processed by the Map tasks increases (Jiang et al., 2010), as this data can be spread over a larger number of nodes.


In smaller clusters, the probability of a node having a replica of the input data is greater than in bigger clusters adopting the same replication factor (Zaharia et al., 2010). In bigger clusters there are more candidate nodes to which a task execution can be delegated, but the data replication factor limits the benefits of data locality to the number of nodes that store a replica of the data. This increases the cost of scheduling and distributing tasks in the cluster, and also increases the costs of data transfer over the network.

The kind of workload submitted for processing by MapReduce impacts the behaviour and performance of MapReduce (Tan et al., 2012; Groot, 2012), requiring specific configuration to obtain an optimal performance. Although studies have been conducted to understand, analyse and improve workload management decisions in MapReduce (Lu et al., 2012; Groot, 2012), there is no evaluation that characterizes the MapReduce behaviour or identifies its optimal configuration to achieve the best performance for packet-level analysis and DPI. Thus, it is necessary to deeply understand the behaviour of MapReduce when processing network traces, and what optimizations can be done to better explore the potential provided by MapReduce for packet-level analysis and DPI.

3.5.2 Possible Threats to Validity

Due to budget and time restrictions, our experiments were performed with small cluster sizes and small input sizes, if compared with benchmarks that evaluate MapReduce performance and scalability (Dean and Ghemawat, 2008). However, relevant performance evaluations and reports of real MapReduce production traces show that the majority of MapReduce jobs are small and executed on a small number of nodes (Zaharia et al., 2008; Wang et al., 2009; Lin et al., 2010; Zaharia et al., 2010; Kavulya et al., 2010; Chen et al., 2011; Guo et al., 2012).

Although MapReduce was designed to handle big data, the use of input data on the order of gigabytes has been reported in realistic production traces (Chen et al., 2011), and this input size has been used in relevant MapReduce performance analyses (Zaharia et al., 2008; Wang et al., 2009; Lin et al., 2010).

Improvements in MapReduce performance and proposed schedulers have focused on problems related to small jobs; for example, Facebook's fairness scheduler aims to provide fast response times for small jobs (Zaharia et al., 2010; Guo et al., 2012). The fair scheduler attempts to guarantee service levels for production jobs by maintaining job pools composed of a smaller number of nodes than the total nodes of a data center, maintaining a minimum share and dividing the excess capacity among all jobs or pools (Zaharia et al., 2010).

According to Zaharia et al. (2010), 78% of Facebook's MapReduce jobs have up to 60 Map tasks. Our evaluated datasets were composed of 35 and 79 files, which implies the same respective numbers of Map tasks, since our approach evaluates an entire block per Map task.

3.6 Chapter Summary

In this chapter, we presented an approach for profiling application traffic using MapReduce, and evaluated its effectiveness for profiling applications through DPI and its completion time scalability in a cloud computing environment.

We proposed a solution based on MapReduce for the deep inspection of distributed application traffic, in order to evaluate the behaviour of distributed systems at runtime, using commodity hardware, in a minimally intrusive way, through a scalable and fault-tolerant approach based on Hadoop, an open source implementation of MapReduce.

MapReduce was used to implement a DPI algorithm to extract application indicators from the JXTA-based traffic of a distributed backup system. We adopted a splitting approach without the division of blocks into records, using a network trace split into files with maximum size smaller than the HDFS block size, to avoid the cost and complexity of providing HDFS with an algorithm for splitting the network trace into blocks, and also to use a whole block as input for each Map function, in order to be able to reassemble two or more packets and mount JXTA messages from the packets of the network traces.

We evaluated the effectiveness of MapReduce for a DPI algorithm and its completion time scalability, over different sizes of network traffic used as input and different cluster sizes. We showed that the MapReduce programming model can express algorithms for DPI and extract application indicators from application network traffic, using virtual machines of a cloud computing provider, for DPI of large amounts of network traffic. We also evaluated its completion time scalability, showing the scalability behaviour, the processing capacity achieved, and the influence of the number of nodes and the input data size on the processing capacity for DPI.

It was shown that the MapReduce completion time scalability for DPI does not follow a linear function, with more significant scalability gains from the addition of nodes in small clusters, and less significant gains in bigger clusters.

According to the results, the input size and cluster size have a significant impact on the processing capacity and completion time of MapReduce jobs for DPI. This highlights the importance of evaluating the best input size and cluster size to obtain an optimal performance of MapReduce jobs, but also indicates the need for more evaluations of the influence of other important factors on MapReduce performance, in order to provide better configuration, selection of input size and machine allocation in a cluster, and to provide valuable information for performance tuning and prediction.


4 Evaluating MapReduce for Network Traffic Analysis

All difficult things have their origin in that which is easy, and great things in that which is small.

—LAO TZU

The use of MapReduce for distributed data processing has been growing and achieving benefits from its application to different workloads. MapReduce can be used for distributed traffic analysis, although network traffic traces present characteristics which are not similar to the data types commonly processed through MapReduce, which in general are divisible and text-like, while network traces are binary and may present restrictions on splitting when processed through distributed approaches.

Due to the lack of evaluation of MapReduce for traffic analysis and the peculiarity of this kind of data, this chapter deeply evaluates the performance of MapReduce for packet-level analysis and DPI of distributed application traffic, evaluating its scalability, speed-up and the behaviour followed by the MapReduce phases. The experiments provide evidence of the predominant phases in this kind of MapReduce job, and show the impact of input size, block size and number of nodes on completion time and scalability.

This chapter is organized as follows. We first describe the motivation for a MapReduce performance evaluation for network traffic analysis in Section 4.1. Then we present the evaluation plan and methodology adopted in Section 4.2, and the results are presented in Section 4.3. Section 4.4 discusses the results and Section 4.5 summarizes the chapter.

4.1 Motivation

It is possible to measure, evaluate and diagnose distributed applications through the evaluation of information from communication protocols, flows, throughput and load distribution (Mi et al., 2012; Nagaraj et al., 2012; Sambasivan et al., 2011; Aguilera et al., 2003; Yu et al., 2011). This information can be collected through network traffic analysis, but to retrieve application information from network traces it is necessary to recognize the application protocol and deeply inspect the traffic to retrieve details about its behaviour, sessions and states.

MapReduce can be used for the offline evaluation of distributed applications, analysing application traffic inside a data center, through packet-level analysis (Lee et al., 2011), evaluating each packet individually, and through DPI (Vieira et al., 2012b,a), adopting a different approach for data splitting, where a whole block is processed without division into individual packets, due to the necessity of reassembling two or more packets to retrieve information from the application layer, in order to evaluate application messages and protocols.

The kind of workload submitted for processing by MapReduce impacts the behaviour and performance of MapReduce (Tan et al., 2012; Groot, 2012), requiring specific configuration to obtain an optimal performance. Information about the occupation of the MapReduce phases, the processing characteristics (whether the job is I/O- or CPU-bound), and the mean duration of Map and Reduce tasks can be used to optimize parameter configurations, and to improve resource allocation and task scheduling.

The main evaluations of MapReduce are in text processing (Zaharia et al., 2008; Chen et al., 2011; Jiang et al., 2010; Wang et al., 2009), where the input data is split into blocks and into records, to be processed by parallel and independent Map functions. For the distributed processing of network traffic traces, which are usually binary, the data splitting into packets is a concern and, in some cases, may require data without splitting, especially when packet reassembly is required to extract application information from the application layer.

Although work has been done to understand, analyse and improve workload management decisions in MapReduce (Lu et al., 2012; Groot, 2012), there is no evaluation that characterizes the MapReduce behaviour or identifies its optimal configuration to achieve the best performance for packet-level analysis and DPI.

Due to the lack of evaluation of MapReduce for traffic analysis and the peculiarity of this kind of data, it is necessary to understand the behaviour of MapReduce when processing network traces, and what optimizations can be done to better explore the potential provided by MapReduce for packet-level analysis and DPI.

This chapter evaluates the MapReduce performance for network packet-level analysis and DPI using Hadoop, characterizing the behaviour followed by the MapReduce phases, the scalability and the speed-up, over variations of input, block and cluster sizes. The main contributions of this chapter are:

1. Characterization of the behaviour of the MapReduce phases for packet-level analysis and DPI;

2. Description of the scalability behaviour and its relation to important MapReduce factors;

3. Identification of the performance provided by the block sizes adopted for different cluster sizes;

4. Description of the speed-up obtained for DPI.

4.2 Evaluation

The goal of this evaluation is to characterize the behaviour of the MapReduce phases, its scalability characteristics under node addition and the speed-up achieved with MapReduce for packet-level analysis and DPI. Thus, we performed a performance measurement and evaluation of MapReduce jobs that execute packet-level analysis and DPI algorithms.

To evaluate MapReduce for DPI, Algorithm 1, implemented by the JXTAPerfMapper and JXTAPerfReducer, was used, applied to new factors and levels. To evaluate MapReduce for packet-level analysis, a port counter algorithm developed by Lee et al. (2011) was used, which divides a block into packets and processes each packet individually to count the number of occurrences of TCP and UDP port numbers. The same algorithm was also evaluated using the splitting approach that processes a whole block per Map function, without the division of a block into records or packets, and the two approaches for packet-level analysis were compared.
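At its core, such a port counter is a conventional word-count-style job. The sketch below is our hedged reconstruction, not the code of Lee et al. (2011); it assumes an input format that delivers one packet per map() call, and PacketDecoder is a hypothetical helper that parses the headers and returns a label such as "tcp:80" or "udp:53".

// Hedged reconstruction of a per-packet port counter (word-count style).
import java.io.IOException;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class PortCount {

    public static class PortMapper
            extends Mapper<LongWritable, BytesWritable, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);

        @Override
        protected void map(LongWritable offset, BytesWritable packet,
                           Context context) throws IOException, InterruptedException {
            // PacketDecoder is a hypothetical helper returning e.g.
            // "tcp:80" or "udp:53", or null for other protocols.
            String port = PacketDecoder.decodePort(packet.getBytes());
            if (port != null) {
                context.write(new Text(port), ONE);
            }
        }
    }

    public static class PortReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text port, Iterable<IntWritable> counts,
                              Context context) throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable c : counts) sum += c.get();
            context.write(port, new IntWritable(sum)); // occurrences per port
        }
    }
}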

4.2.1 Evaluation Methodology

For this evaluation, we adopted a methodology based on the systematic approach to performance evaluation defined by Jain (1991), which consists of the definition of the goal, metrics, factors and levels for a performance study.

The goal of this evaluation is to characterize the behaviour of MapReduce phases, its scalability through node addition, and the speed-up achieved with MapReduce for packet level analysis and DPI, to understand the impact of each factor on MapReduce performance for this kind of input data, in order to be able to configure MapReduce and obtain an optimal performance over the evaluated factors.

Table 4.1 Metrics for evaluating MapReduce for DPI and packet level analysis

Metrics             Description
Completion Time     Completion time of MapReduce jobs
Phases Time         Time consumed by each MapReduce phase in the total completion time of MapReduce jobs
Phases Occupation   Relative time consumed by each MapReduce phase in the total completion time of MapReduce jobs
Scalability         Processing capacity increase obtained with node addition in a MapReduce cluster
Speed-up            Improvement in completion time against the same algorithm implemented without distributed processing

Table 4.1 describes the evaluated metrics: the completion time of MapReduce jobs, the relative and absolute time of each MapReduce phase within the total job time, the processing capacity scalability, and the speed-up against non-distributed processing.
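Stated as formulas (a sketch consistent with the definitions above, where T_mono denotes the completion time of the non-distributed implementation and T_job(n) the MapReduce job completion time on n worker nodes):

```latex
\mathrm{Speedup}(n) = \frac{T_{\mathrm{mono}}}{T_{\mathrm{job}}(n)}, \qquad
\mathrm{Throughput}(n) = \frac{\text{input size}}{T_{\mathrm{job}}(n)}, \qquad
\mathrm{RelativeThroughput}(n) = \frac{\mathrm{Throughput}(n)}{n}
```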

The experiments adopt the factors and levels described in Table 4.2. The selected factors were chosen due to their importance for MapReduce performance evaluations and their adoption in relevant previous research (Jiang et al., 2010; Chen et al., 2011; Shafer et al., 2010; Wang et al., 2009).

Table 4.2 Factors and Levels

Factors                  Levels
Number of Worker Nodes   2 up to 29
Block Size               32MB, 64MB and 128MB
Input Size               90Gb and 30Gb

Hadoop logs are a valuable source of information about the Hadoop environment and its job executions; important MapReduce indicators and information about jobs, tasks, attempts, failures and topology are logged by Hadoop during its execution. The data used to perform this performance evaluation was extracted from Hadoop logs.

To extract information from Hadoop logs and to evaluate the selected metrics, we developed Hadoop-Analyzer (Vieira, 2013), an open source and publicly available tool to extract and evaluate MapReduce indicators, such as job completion time and MapReduce phases distribution, from the logs generated by Hadoop during its job executions. With Hadoop-Analyzer it is possible to generate graphs of the extracted indicators and thereby evaluate the desired metrics.

Hadoop-Analyzer relies on Rumen (2012) to extract raw data from Hadoop logs and generate structured information, which is processed and shown in graphs generated through R (Eddelbuettel, 2012) and Gnuplot (Janert, 2010), such as the results presented in Section 4.3.

4.2.2 Experiment Setup

Network traffic traces of distributed applications were captured to be used as input for the MapReduce jobs of our experiments; these traces were divided into files with size defined by the block size adopted in each experiment, and the files were then stored into HDFS, following the process described in the previous chapter. The packets were captured using Tcpdump and were split into files with sizes of 32MB, 64MB and 128MB.

For the packet level analysis and DPI evaluation, two sizes of datasets were captured from network traffic transferred between some nodes of distributed systems. One dataset was 30Gb of network traffic, with data divided into 30 files of 128MB, 60 files of 64MB and 120 files of 32MB. The other dataset was 90Gb of network traffic, split into 90 files of 128MB, 180 files of 64MB and 360 files of 32MB.

For the experiments of DPI through MapReduce, we used network traffic captured from the same JXTA-based application described in Section 3.3.2, but with different sizes of traces and files. To evaluate MapReduce for packet level analysis, we processed network traffic captured from data transferred between 5 clients and one server of a data storage service provided through the Internet, known as Dropbox¹.

¹ http://www.dropbox.com/

To evaluate MapReduce for packet level analysis and DPI, one driver was developed for each case of network traffic analysis, with one version using MapReduce and another without it.

CountUpDriver implements packet level analysis as a port counter for network traces, which records how many times a port appears in TCP or UDP packets; its implementation is based on processing a whole block as input for each Map function, without splitting, and with block size defined by the block size of HDFS. Furthermore, a port counter implemented with P3 was evaluated; this implementation is a version of the tool presented by Lee et al. (2011), which adopts an approach that divides a block into packets and processes each packet individually, without dependent information between packets.
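To make the whole-block approach concrete, the sketch below shows the shape such a port-counting Map function can take. It is a minimal sketch, not the actual CountUpDriver source: PcapBlockReader and PacketRecord are hypothetical helpers assumed for walking the pcap records contained in a block.

```java
import java.io.IOException;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.mapreduce.Mapper;

// Whole-block port counting: each Map call receives an entire HDFS block of
// captured traffic as a single value, iterates over the packets inside it,
// and emits (port, 1) pairs; a summing Reducer (as in WordCount) then totals
// the occurrences of each TCP/UDP port.
public class PortCountMapper
        extends Mapper<LongWritable, BytesWritable, IntWritable, IntWritable> {

    private static final IntWritable ONE = new IntWritable(1);
    private final IntWritable port = new IntWritable();

    @Override
    protected void map(LongWritable offset, BytesWritable block, Context context)
            throws IOException, InterruptedException {
        // PcapBlockReader and PacketRecord are assumed helpers (not part of
        // Hadoop) that parse the pcap records contained in the block buffer.
        for (PacketRecord pkt : PcapBlockReader.packets(block.getBytes(), block.getLength())) {
            if (pkt.isTcp() || pkt.isUdp()) {
                port.set(pkt.destinationPort());
                context.write(port, ONE);
            }
        }
    }
}
```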

JxtaSocketPerfDriver implements DPI to extract, from JXTA (Duigou, 2003) network traffic, the round-trip time of JXTA messages, the number of connection requests per time and the number of JXTA Socket messages exchanged between JXTA clients and a JXTA Socket server. JxtaSocketPerfDriver uses whole files as input for each Map function, with size defined by the HDFS block size, in order to reassemble JXTA messages whose content is divided across many TCP packets.

One TCP packet can transport one or more JXTA messages at a time, which makes it necessary to evaluate the full content of TCP segments to recognize all possible JXTA messages, instead of evaluating only a message header or signature. The round-trip time of JXTA messages is calculated from the time between a client peer sending a JXTA message and receiving the confirmation of the JXTA message arrival. To evaluate the round-trip time it is necessary to keep the information of requests and which responses correspond to each request; thus, it is necessary to analyse several packets to retrieve and evaluate information about the application behaviour and its states.
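A minimal sketch of this request/response bookkeeping is shown below; messageId is a hypothetical correlation key standing in for the actual JXTA message fields, and the class is illustrative rather than the JxtaSocketPerfDriver source.

```java
import java.util.HashMap;
import java.util.Map;

// Round-trip time by request/response matching: the timestamp of each
// outgoing request is kept until the matching arrival confirmation is seen,
// at which point the RTT is the difference between the two timestamps.
public class RttTracker {

    private final Map<String, Long> pendingRequests = new HashMap<>();

    // Called when a client peer sends a JXTA message.
    public void onRequest(String messageId, long timestampMicros) {
        pendingRequests.put(messageId, timestampMicros);
    }

    // Called when the arrival confirmation is seen; returns the RTT in
    // microseconds, or -1 if no matching request was recorded.
    public long onConfirmation(String messageId, long timestampMicros) {
        Long sent = pendingRequests.remove(messageId);
        return (sent == null) ? -1 : timestampMicros - sent;
    }
}
```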

To analyse the speed-up provided by MapReduce against a single machine execution, two drivers were developed that use the same dataset and implement the same algorithms implemented by CountUpDriver and JxtaSocketPerfDriver, but without distributed processing. These drivers are, respectively, CountUpMono and JxtaSocketPerfMono.

The source code of all implemented drivers, and of other implementations to support the use of MapReduce for network traffic analysis, is open source and publicly available at Vieira (2012a).

The experiments were performed on a 30-node Hadoop-1.0.3 cluster composed of nodes with four 3.2GHz cores, 8GB RAM and 260GB of available hard disk space, running Linux kernel 3.2.0-29. Hadoop was used as our MapReduce implementation, configured to permit a maximum of 4 Map and 1 Reduce tasks per node; also, we defined -Xmx1500m as the child JVM option and 400 as the io.sort.mb value.
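For reference, the sketch below restates these settings as Hadoop 1.x configuration keys (normally placed in mapred-site.xml on each node); the key names are standard Hadoop 1.x properties, and the values are those described above.

```java
import org.apache.hadoop.conf.Configuration;

public final class ClusterSettings {
    // Builds a Configuration carrying the experiment settings stated above.
    public static Configuration experimentConf() {
        Configuration conf = new Configuration();
        conf.set("mapred.tasktracker.map.tasks.maximum", "4");    // 4 Map slots per node
        conf.set("mapred.tasktracker.reduce.tasks.maximum", "1"); // 1 Reduce slot per node
        conf.set("mapred.child.java.opts", "-Xmx1500m");          // heap of each task JVM
        conf.set("io.sort.mb", "400");                            // Map-side sort buffer (MB)
        return conf;
    }
}
```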

For the drivers CountUpDriver and JxtaSocketPerfDriver, the number of Reducers was defined as a function of the number of Reducer slots per node, given by numReducers = (0.95)(numNodes)(maxReducersPerNode) (Kavulya et al., 2010). The driver implemented with P3 (Lee et al., 2011) adopts a fixed number of Reducers, defined as 10 by the available version of P3. Each experiment was executed 20 times to obtain reliable values (Chen et al., 2011), within a confidence interval of 95% and a maximum error ratio of 5%.
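A one-method sketch of this rule (the class and method names are ours, for illustration):

```java
public final class ReducerCount {
    // numReducers = 0.95 * numNodes * maxReducersPerNode, rounded down.
    // e.g. numReducers(6, 1) == 5, matching the 5 Reducers observed for the
    // 6-node case in Section 4.3.
    static int numReducers(int numNodes, int maxReducersPerNode) {
        return (int) Math.floor(0.95 * numNodes * maxReducersPerNode);
    }
}
```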


4.3 Results

Two dataset sizes of network traffic were used during the experiments: 30Gb and 90Gb. Each dataset was processed by MapReduce jobs that implement packet level analysis and DPI, in Hadoop clusters with the number of worker nodes varying between 2 and 29, and block sizes of 32MB, 64MB and 128MB.

Each dataset was also processed by algorithms implemented through MapReduce and without distributed processing, to evaluate the speed-up achieved. Table 4.3 shows the execution times obtained by the non-distributed processing, implemented and executed through JxtaSocketPerfMono and CountUpMono, using a single machine with the resource configuration described in Subsection 4.2.2.

Table 4.3 Non-Distributed Execution Time in seconds

Block    JxtaSocketPerfMono        CountUpMono
         90Gb        30Gb          90Gb      30Gb
32MB     1,745.35    584.92        872.40    86.71
64MB     1,755.40    587.02        571.33    91.76
128MB    1,765.50    606.50        745.25    94.82

Figure 4.1 shows the completion time and speed-up of the DPI Algorithm 1 to extract indicators from a JXTA-based distributed application. The completion time represents the job time of JxtaSocketPerfDriver, and the speed-up represents the gains in execution time of JxtaSocketPerfDriver against JxtaSocketPerfMono when processing 90Gb of network traffic.

Figure 4.1 DPI Completion Time and Speed-up of MapReduce for 90Gb of a JXTA-application network traffic

According to Figure 4.1, JxtaSocketPerfDriver performs better than JxtaSocketPerfMono over all factor variations. Initially, we observed the best speed-up of 3.70 times with 2 nodes and blocks of 128MB; ultimately, we observed a maximum speed-up of 16.19 times with 29 nodes and blocks of 64MB (i.e., the 1,755.40 seconds of Table 4.3 reduced to roughly 108 seconds). The speed-up achieved with a block size of 32MB was initially the worst case, but it increased with node addition, becoming better than blocks of 128MB and near the speed-up achieved with blocks of 64MB, for a cluster with 29 nodes.

The completion time scalability behaviour with a 32MB block size showed a reduction in completion time for every node addition, although the cases with block sizes of 64MB and 128MB present no significant reduction in completion time in clusters with more than 25 nodes. According to Figure 4.1, the completion time does not reduce linearly with node addition, and the improvement in completion time was less significant when the dataset was processed by more than 14 nodes, especially for the cases that adopted blocks of 64MB and 128MB.

Figure 4.2 shows the processing capacity of MapReduce applied to DPI of 90Gb of a JXTA-based application traffic, over variation of cluster size and block size. The processing capacity was evaluated by the throughput of network traffic processed, and by the relative throughput, defined as the processing capacity achieved per number of allocated nodes.

Figure 4.2 DPI Processing Capacity for 90Gb

The processing capacity achieved for DPI of 90Gb using a block size of 64MB was 159.89 Mbps with 2 worker nodes, increasing up to 869.43 Mbps with 29 worker nodes. For the same case, the relative processing capacity achieved was 79.94 Mbps/node with 2 nodes and 29.98 Mbps/node with 29 nodes, showing a decrease of relative processing capacity with the growth of the MapReduce cluster size.

Although the processing capacity increased, the relative processing capacity, defined as the processing capacity per allocated node, decreased with every node addition. This behaviour indicates that MapReduce presents a reduction of efficiency with the increase of cluster size (Gunther, 2006), which highlights the importance of evaluating the cost of node allocation and its benefits for completion time and processing capacity.

Figures 4.1 and 4.2 also show the difference in performance achieved with different block sizes, and its relation to the cluster size. It was observed that blocks of 128MB achieved a higher throughput in cluster sizes up to 14 nodes, and that blocks of 64MB performed better in clusters bigger than 14 worker nodes.

Figures 4.3(a) and 4.3(b) show the behaviour of MapReduce phases for DPI of 90Gb.

Figure 4.3 MapReduce Phases Behaviour for DPI of 90Gb. (a) Phases Time for DPI; (b) Phases Distribution for DPI

MapReduce execution can be divided into Map, Shuffle, Sort and Reduce phases, although Shuffle tasks can be executed before the conclusion of all Map tasks; thereby Map and Shuffle tasks can overlap. According to the Hadoop default configuration, the overlapping between Map and Shuffle tasks starts after 5% of Map tasks are concluded; Shuffle tasks are then started and run until the Map phase ends.
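In Hadoop 1.x this threshold is exposed as the mapred.reduce.slowstart.completed.maps property, whose default of 0.05 corresponds to the 5% figure above; a minimal sketch:

```java
import org.apache.hadoop.conf.Configuration;

public final class SlowStart {
    // Fraction of Map tasks that must complete before Reducers are scheduled
    // and begin shuffling; 0.05 is the Hadoop 1.x default.
    public static void setDefaultSlowStart(Configuration conf) {
        conf.set("mapred.reduce.slowstart.completed.maps", "0.05");
    }
}
```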

In Figures 4.3(a) and 4.3(b) we show the overlapping between Map and Shuffle tasks as a specific MapReduce phase, represented as the "Map and Shuffle" phase. The time consumed by Setup and Cleanup tasks was also considered, for a better visualization of the division of execution time in Hadoop jobs.

Figure 4.3(a) shows the cumulative time of each MapReduce phase in the total job time. For DPI, the Map time, which comprises the Map and the "Map and Shuffle" phases, consumes the major part of a job's execution time, and it is the phase that exhibits the most variation as the number of nodes varies; but no significant time reduction is achieved with more than 21 nodes and block sizes of 64MB or 128MB.

The Shuffle time, which happens after all Map tasks are completed, presented low variation with node addition. The Sort and Reduce phases required relatively low execution times and do not appear in some bars of the graph. Setup and Cleanup tasks consumed an almost constant time, independently of cluster size or block size variation.

Figure 4.3(b) shows the percentage of each MapReduce phase in the total job completion time. We also considered an additional phase, called Others, which represents the time consumed by cluster management tasks, like scheduling and task assignment. The behaviour followed by phase occupation is similar over all evaluated block sizes, with the exception of the case where Map time does not decrease with node addition, in clusters using a block size of 128MB and with more than 21 nodes.

With cluster size variation, a relative reduction in Map time was observed, along with a relative increase in the time of the Shuffle, Setup and Cleanup phases. During the evaluation of Figure 4.3(a), it was observed that Setup and Cleanup phases consume an almost constant absolute time, independently of cluster size and block size; thereby, with node addition and decreasing completion time, the time consumed by Setup and Cleanup tasks became more significant in relation to the total execution time, because the total job completion time decreased while the time of Setup and Cleanup tasks remained almost the same. Therefore, the percentage of time spent in Setup and Cleanup increased and became more significant as the total job completion time was reduced by node addition to the MapReduce cluster.

According to Figures 4.3(a) and 4.3(b), the Map phase is predominant in MapReduce jobs for DPI, and the reduction of the total job completion time with node addition is related to the decrease of the Map phase time. Thus, improvements in Map phase execution for DPI workloads can produce more significant gains in reducing the total job completion time for DPI.

Figure 4.4 shows the comparison between the completion times of CountUpDriver and P3 for packet level analysis of 90Gb of network traffic, over variation of cluster size and block size.

Figure 4.4 Completion time comparison of MapReduce for packet level analysis, evaluating the approach with and without splitting into packets

P3 achieves a better completion time than CountUpDriver over all factors, showing that a divisible-files approach performs better for packet level analysis, and that block size is a significant factor for both approaches, due to the significant impact on completion time caused by the adoption of blocks with different sizes.

With variation in the number of nodes, it was observed that a block size of 128MB achieved a better completion time up to 10 nodes, but that no further improvement in completion time was achieved with node addition in clusters with more than 10 nodes. Blocks of 32MB and 64MB only present a significant completion time difference in clusters of up to 14 nodes; for clusters bigger than 14 nodes a similar completion time was achieved for both block sizes, and these are still better completion times than those achieved with blocks of 128MB.

Figures 4.5(a) and 4.5(b) show, respectively, the completion time and speed-up of P3 and CountUpDriver against CountUpMono, for packet level analysis, with variation in the number of nodes and block size. For both cases, the use of a block size of 128MB provides the best completion time in smaller clusters, up to 10 nodes, but it provides a worse completion time in clusters with more than 21 nodes. For both evaluations, the speed-up adopting blocks of 128MB scales up to 10 nodes, but for bigger clusters no speed-up gain was achieved with node addition.

Figure 4.5 CountUp completion time and speed-up for 90Gb. (a) P3 evaluation; (b) CountUpDriver evaluation

Using blocks of 32MB, an improvement in completion time was achieved for every node addition, which yields a speed-up improvement for all cluster sizes, although this block size did not present a better completion time than the other block sizes in any case.

The adoption of 32MB blocks provided a better speed-up than the other block sizes in clusters with more than 14 nodes, because the time consumed by CountUpMono to process 90Gb divided into 32MB files was bigger than the time consumed in the cases with other block sizes, as shown in Table 4.3.


Figures 4.6(a) and 4.6(b) show the processing capacity of P3 and CountUpDriver to perform packet level analysis of 90Gb of network traffic, over variation of cluster size and block size.

Figure 4.6 CountUp processing capacity for 90Gb. (a) P3 processing capacity; (b) CountUpDriver processing capacity

Using a block size of 64MB, P3 achieved a throughput of 413.16 Mbps with 2 nodes and a maximum of 1606.13 Mbps with 28 nodes, while its relative throughput for the same configurations was 206.58 Mbps/node and 55.38 Mbps/node. The processing capacity for packet level analysis, evaluated for P3 and CountUpDriver, follows the same behaviour shown in Figure 4.2. Additionally, it is possible to observe a convergent decrease of relative processing capacity for all evaluated block sizes, starting at a cluster size of 14 nodes, where the relative throughput achieved by all block sizes is quite similar.

Figure 4.6(b) shows the relative processing capacity increasing with the addition of 2 nodes to a cluster with 4 nodes. For packet level analysis of 90Gb, MapReduce achieved its best processing capacity efficiency per node using 6 nodes, which provides 24 Mappers and 5 Reducers per Map and Reduce wave. With the adopted variation in the number of Reducers according to cluster size, using 5 Reducers achieved better processing efficiency and a significant reduction in Reduce time, as shown in Figure 4.7(b).

Figures 4.7(a) and 4.7(b) show the cumulative time per phase during a job execution.

Figure 4.7 MapReduce Phases times of CountUp for 90Gb. (a) MapReduce Phases Times of P3; (b) MapReduce Phases Times for CountUpDriver

The behaviour of MapReduce phases for packet level analysis is similar to the behaviour observed for DPI: the Map time is predominant; Map and Shuffle times do not decrease with node addition once the cluster exceeds a certain size; and the Sort and Reduce phases consume little execution time. The exception is that the Shuffle phase consumes more time in packet level analysis jobs than in DPI, especially in smaller clusters.


For packet level analysis, the amount of intermediate data generated by Map functions is bigger than the amount generated through the use of MapReduce for DPI: packet level analysis generates intermediate data for each evaluated packet, whereas for DPI it is necessary to evaluate more than one packet to generate intermediate data. The Shuffle phase is responsible for sorting and transferring the Map outputs to the Reducers as inputs; thus the amount of intermediate data generated by Map tasks, and the network transfer cost, impact the Shuffle phase time.

Figures 4.8(a) and 4.8(b) show the percentage of each phase in the job completion time of P3 and CountUpDriver, respectively.

Figure 4.8 MapReduce Phases Distribution for CountUp of 90Gb. (a) Phases Distribution for P3; (b) Phases Distribution for CountUpDriver

Following the behaviour observed in Figure 4.3(b), the Map and Shuffle phases consume more relative time than all other phases, over all factors. But, for packet level analysis, the Map phase occupation decreases significantly with node addition only when block sizes are 32MB or 64MB, following the completion time behaviour observed in Figures 4.5(a) and 4.5(b).

For the dataset of 30Gb of network traffic, the same experiments were conducted, and the results of the MapReduce phases evaluation presented a behaviour quite similar to the results already presented by the 90Gb experiments, for DPI and packet level analysis.

Relevant differences were identified in the speed-up, completion time and scalability evaluations, as shown by Figures 4.9(a) and 4.9(b), which exhibit the completion time and processing capacity scalability of MapReduce for DPI of 30Gb of network traffic, with variation in cluster size and block size.

Figure 4.9 DPI Completion Time and Processing Capacity for 30Gb. (a) DPI Completion Time and Speed-up of MapReduce for 30Gb of a JXTA-application network traffic; (b) DPI Processing Capacity for 30Gb


The completion time of DPI of 30Gb scales significantly up to 10 nodes; beyond that, the experiment presents no further gains with node addition using a block size of 128MB, and presents only small reductions in completion time in the cases using blocks of 32MB and 64MB. This behaviour is the same observed for the job completion time for 90Gb, as shown in Figures 4.5(a) and 4.5(b), but with significant scaling only up to 10 nodes for the dataset of 30Gb, while scaling up to 25 nodes was achieved for 90Gb.

Figure 4.9(a) shows that a completion time of 199.12 seconds was obtained with 2 nodes using blocks of 128MB, scaling to 87.33 seconds with 10 nodes and the same block size, while 474.44 and 147.12 seconds, respectively, were achieved by the same configurations for DPI of 90Gb, as shown in Figure 4.1.

Although the 90Gb case (Figure 4.1) processed a dataset 3 times bigger than the 30Gb case (Figure 4.9(a)), the completion time achieved in all 90Gb cases was smaller than 3 times the completion time for processing 30Gb. For the cases with 2 and 10 nodes using blocks of 128MB, it took respectively only 2.38 and 1.68 times longer to process the 90Gb dataset, which is 3 times bigger than the 30Gb dataset.

Figure 4.9(b) shows the processing capacity for DPI of 30Gb. The maximum speed-up achieved for DPI of 30Gb was 7.90 times, using blocks of 32MB and 29 worker nodes, while a maximum speed-up of 16.19 times was achieved for DPI of 90Gb with 29 nodes.

From these results, it is possible to conclude that MapReduce efficiency is better suited to bigger data, and that in some cases it can be more efficient to accumulate input data in order to process a bigger amount of data at once. Therefore, it is important to analyse the dataset size to be processed, and to quantify the ideal number of allocated nodes for each job, in order to avoid wasting resources.

4.4 Discussion

In this section, we discuss the measured results and evaluate their meaning, restrictions and opportunities. We also discuss possible threats to the validity of our experiment.

4.4.1 Results Discussion

According to the processing capacity presented in our experimental results of the evaluation of MapReduce for packet level analysis and DPI, it is possible to see that the adoption of MapReduce for packet level analysis and DPI provided a high processing capacity and speed-up of completion time when compared with a solution without distributed processing, making it possible to evaluate a large amount of network traffic and extract information from the distributed applications of an evaluated data center.

The block size adopted and the number of nodes allocated to data processing are important factors for obtaining an efficient job completion time and processing capacity scalability. Some benchmarks show that MapReduce performance can be improved by an optimal block size choice (Jiang et al., 2010), showing better performance with the adoption of bigger block sizes. We evaluated the impact of the block size for packet level analysis and DPI workloads; blocks of 128MB provided a better completion time for smaller clusters, but blocks of 64MB performed better in bigger clusters. Thus, in order to obtain an optimal completion time when adopting bigger block sizes, it is also necessary to evaluate the node allocation for the MapReduce job, because variations in block size and cluster size can cause a significant impact on completion time.

The different processing capacities achieved for processing the datasets of 30Gb and 90Gb highlight the efficiency of MapReduce in dealing with bigger data processing, and that it can be more efficient to accumulate input data in order to process a larger amount of data. Therefore, it is important to analyse the dataset size to be processed, and to quantify the ideal number of allocated nodes for each job, in order to avoid wasting resources.

The evaluation of the dataset size and the optimal number of nodes is important to understand how to schedule MapReduce jobs and resource allocation through specific Hadoop schedulers, such as the Capacity Scheduler and the Fair Scheduler (Zaharia et al., 2010), in order to avoid wasting resources by allocating nodes that will not produce significant gains (Verma et al., 2012a). Thus, the variation of processing capacity achieved in our experiments highlights the importance of evaluating the cost of node allocation and its benefits, and the need to evaluate the ideal size of pools in the Hadoop cluster, to obtain efficiency between the cluster size allocated to process an input size and the resource sharing of a Hadoop cluster.

The MapReduce processing capacity does not scale proportionally to node addition; in some cases there is no significant processing capacity increase with node addition, as shown in Figure 4.1, where jobs using block sizes of 64MB and 128MB in clusters with more than 14 nodes, for DPI of 90Gb, present no significant completion time gain with node addition.

The number of execution waves is a factor that must be evaluated (Kavulya et al., 2010) when MapReduce scalability is analysed, because the decrease in execution time is related to the number of execution waves necessary to process all input data. The number of execution waves is defined by the available slots for the execution of Map and Reduce tasks; for example, if a MapReduce job is divided into 10 tasks in a cluster with 5 available slots, then 2 execution waves will be necessary for all tasks to be executed.
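This relation can be written as waves = ceil(numTasks / availableSlots); a minimal sketch (the helper name is ours, for illustration):

```java
public final class Waves {
    // Number of execution waves needed to run all tasks, given the total
    // task count and the cluster-wide slots available per wave.
    // e.g. executionWaves(10, 5) == 2, as in the example above.
    static int executionWaves(int numTasks, int availableSlots) {
        return (int) Math.ceil((double) numTasks / availableSlots);
    }
}
```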

Figure 4.9(a) shows a case of DPI of 30Gb, using a block size of 128MB, in which there was no reduction of completion time with cluster sizes bigger than 10 nodes, because there was no reduction in the number of execution waves. But in our experiments, cases with a reduction of execution waves also presented no significant reduction of completion time, such as the cases using a block size of 128MB in clusters with 21 nodes or more, for DPI and packet level analysis, shown in Figure 4.1. Thus, node addition and task distribution must be evaluated to optimize resource usage and to avoid additional or unnecessary costs with machines and power consumption.

The comparison of completion time between CountUpDriver and P3 shows that P3, which splits the data into packets, performs better than CountUpDriver, which processes a whole block without splitting. When processing a whole block as input, the local node parallelism is limited to the number of slots per node, while in the divisible approach each split can be processed by an independent thread, increasing the possible parallelism. Because some cases require data without splitting, such as DPI and video processing cases (Pereira et al., 2010), improvements for this issue must be evaluated, considering better schedulers, data location and task assignment.

The behavioural evaluation of MapReduce phases showed that the Map phase is predominant in the total execution time for packet level analysis and DPI, with Shuffle being the second most expressive phase. Shuffle can overlap the Map phase, and this condition must be considered in MapReduce evaluations, especially in our case, because the overlap of Map and Shuffle represents more than 50% of the total execution time.

The long duration of the "Map and Shuffle" phase represents a long time of Shuffle tasks being executed in parallel with Map tasks, and a long time of slots allocated to Shuffle tasks that will only be concluded after all Map tasks are finished, although these Shuffle tasks can last longer than the time required to read and process the generated intermediate data. If there are slots allocated to Shuffle tasks that are only waiting for the Map phase conclusion, these slots could be used for other task executions, which could accelerate the job completion time.

With the increase of cluster size and the reduction of job completion time, it was observed that the Map phase showed a proportional decrease, while the Shuffle phase increased with the growth of the number of nodes. With more nodes, the intermediate data generated by Map tasks is placed on more nodes, which are responsible for shuffling the data and sending it to specific Reducers, increasing the amount of remote I/O from Mappers to Reducers and the number of data sources for each Reducer. The Shuffle phase may represent a bottleneck (Zhang et al., 2009) for scalability and could be optimized, due to I/O restrictions (Lee et al., 2012; Akram et al., 2012) and data locality issues for the Reduce phase (Hammoud and Sakr, 2011).

Information extracted from the analysed results, about the performance obtained with specific cluster, block and input sizes, is important for configuring MapReduce resource allocation and specialized schedulers, such as the Fair Scheduler (Zaharia et al., 2008), which defines pool sizes and resource shares for MapReduce jobs. Thus, with the information on the performance achieved with specific resources, it is possible to configure MapReduce parameters in order to obtain efficiency between the resource allocation and the expected completion time or resource sharing (Zaharia et al., 2008, 2010).

4.4.2 Possible threats to validity

In this chapter we evaluated, for packet level analysis, a port counter implemented with P3. We used a version of this implementation published on the Lee et al. (2011) website², obtained in February 2012, when a complete binary version was available; this binary version was used in our experiments, but it is currently no longer available.

² https://sites.google.com/a/networks.cnu.ac.kr/yhlee/p3

Part of the P3 source code was published later, but not all of the code necessary to compile all the binary libraries needed to evaluate the P3 implementation of a port counter. Thereby, it is important to highlight that the results obtained through our evaluation were for the P3 version obtained in February 2012 from the Lee et al. (2011) website.

It is also important to highlight that DPI can face restrictions when evaluating encrypted messages, and that the obtained results are specific to the input datasets, factors, levels and experiment setup used in our evaluation.

4.5 Chapter Summary

In this chapter, we evaluated the performance of MapReduce for packet level analysis and DPI of application traffic. We evaluated how input data, block and cluster sizes impact MapReduce phases, job completion time, processing capacity scalability and the speed-up achieved in comparison with the same algorithm executed by a non-distributed implementation.

The results show that MapReduce presents a high processing capacity for dealing with massive application traffic analysis. The behaviour of MapReduce phases over variation of block size and cluster size was evaluated; we verified that packet level analysis and DPI are Map-intensive jobs, and that the Map phase consumes more than 70% of the execution time, with the Shuffle phase being the second predominant phase.

We showed that input size, block size and cluster size are important factors to be considered to achieve a better job completion time and to explore MapReduce scalability and efficient resource allocation, due to the variation in completion time caused by the adopted block size and, in some cases, due to the processing capacity not increasing with node addition to the cluster.

We also showed that using a whole block as input for Map functions achieved a poorer performance than using divisible data; thereby, more evaluation is necessary to understand how this can be handled and improved.


5 Conclusion and Future Work

The softest things in the world overcome the hardest things in the world.

—LAO TZU

Distributed systems have been adopted for building modern Internet services and cloud computing infrastructure. The detection of error causes, and the diagnosis and reproduction of errors of distributed systems, are challenges that motivate efforts to develop less intrusive mechanisms for monitoring and debugging distributed applications at runtime.

Network traffic analysis is one option for distributed systems measurement, although there are limitations on the capacity to process large amounts of network traffic in a short time, and on the scalability to process network traffic where there is variation in resource demand.

In this dissertation we proposed an approach to perform deep inspection of distributed application network traffic, in order to evaluate distributed systems at a data center through network traffic analysis, using commodity hardware and cloud computing services, in a minimally intrusive way. Thus we developed an approach based on MapReduce to evaluate the behavior of a JXTA-based distributed system through DPI.

We evaluated the effectiveness of MapReduce to implement a DPI algorithm and its completion time scalability to measure a JXTA-based application, using virtual machines of a cloud computing provider. We also deeply evaluated the performance of MapReduce for packet-level analysis and DPI, characterizing the behavior followed by MapReduce phases, its processing capacity scalability and speed-up, over variations of input size, block size and cluster size.

5.1 Conclusion

With our proposed approach, it is possible to measure the network traffic behavior of distributed applications with intensive network traffic generation, through the offline evaluation of information from the production environment of a distributed system, making it possible to use the information from the evaluated indicators to diagnose problems and analyse the performance of distributed systems.

We showed that the MapReduce programming model can express algorithms for DPI, such as Algorithm 1, implemented to extract application indicators from the network traffic of a JXTA-based distributed application. We analysed the completion time scalability achieved for different numbers of nodes in a Hadoop cluster composed of virtual machines, with different sizes of network traffic used as input. We showed the processing capacity and the completion time scalability achieved, and also the influence of the number of nodes and the input data size on the processing capacity for DPI using virtual machines of Amazon EC2, for a selected scenario.

We evaluated the performance of MapReduce for packet level analysis and DPI of application traffic, using commodity hardware, and showed how input data size, block size and cluster size cause relevant impacts on MapReduce phases, job completion time, processing capacity scalability and the speed-up achieved in comparison with the same execution by a non-distributed implementation.

The results showed that although MapReduce presents a good processing capacity, using cloud services or commodity computers, for dealing with massive application traffic analysis, it is necessary to evaluate the behaviour of MapReduce when processing specific data types, in order to understand its relation with the available resources and the configuration of MapReduce parameters, and to obtain an optimal performance for specific environments.

We showed that MapReduce processing capacity scalability is not proportional to the number of allocated nodes, and that the relative processing capacity decreases with node addition. We showed that input size, block size and cluster size are important factors to be considered to achieve a better job completion time and to explore MapReduce scalability, due to the observed variation in completion time caused by the different block sizes adopted. Also, in some cases, the processing capacity does not scale with node addition to the cluster, which highlights the importance of allocating resources according to the workload and input data, in order to avoid wasting resources.


We verified that packet level analysis and DPI are Map-intensive jobs, because the Map phase consumes more than 70% of the total job completion time, with the Shuffle phase being the second predominant phase. We also showed that using a whole block as input for Map functions achieved a poorer completion time than the approach which splits the block into records.

5.2 Contributions

We attempted to analyse the processing capacity problem of measuring distributed systems through network traffic analysis; the results of the work presented in this dissertation provide the contributions below:

1. We proposed an approach to implement DPI algorithms through MapReduce, using whole blocks as input for Map functions. The effectiveness of MapReduce for a DPI algorithm to extract indicators from a distributed application's traffic was shown, as was the completion time scalability of MapReduce for DPI, using virtual machines of a cloud provider;

2. We developed JNetPCAP-JXTA (Vieira, 2012b), an open source parser to extract JXTA messages from network traffic traces;

3. We developed Hadoop-Analyzer (Vieira, 2013), an open source tool to extract indicators from Hadoop logs and generate graphs of specified metrics;

4. We characterized the behavior followed by MapReduce phases for packet level analysis and DPI, showing that this kind of job is Map-intensive and highlighting points that can be improved;

5. We described the processing capacity scalability of MapReduce for packet level analysis and DPI, evaluating the impact caused by variations in input size, cluster size and block size;

6. We showed the speed-up obtained with MapReduce for DPI, with variations in input size, cluster size and block size;

7. We published two papers reporting our results, as follows:

(a) Vieira, T., Soares, P., Machado, M., Assad, R., and Garcia, V. Evaluating Performance of Distributed Systems with MapReduce and Network Traffic Analysis. In ICSEA 2012, The Seventh International Conference on Software Engineering Advances. Xpert Publishing Services.

(b) Vieira, T., Soares, P., Machado, M., Assad, R., and Garcia, V. Measuring Distributed Applications Through MapReduce and Traffic Analysis. In Parallel and Distributed Systems (ICPADS), 2012 IEEE 18th International Conference on, pages 704–705.

5.2.1 Lessons Learned

The contributions cited are of scientific and academic scope, with implementations and evaluations little explored in the literature. However, with the development of this work, some important lessons were learned.

During this research, different approaches for evaluating the distributed systems of cloud computing providers were studied. In this period, we could see the importance of performance evaluation in a cloud computing environment, and the recent efforts to diagnose and evaluate systems in the production environment of a data center. Also, the growth of the Internet and of resource utilization makes necessary solutions able to evaluate large amounts of data in a short time, with low performance degradation of the evaluated system.

MapReduce has grown as a general purpose solution for big data processing, but it is not a solution for all kinds of problems, and its performance depends on several parameters. Some research has been done to improve MapReduce performance, through analytical modelling, simulation and measurement, but the most relevant contributions in this direction were guided by realistic workload evaluations, from large MapReduce clusters.

We learned that, despite the facilities provided by MapReduce for distributed processing, its performance is influenced by the environment, network topology, workload, data type and several specific parameter configurations. Therefore, an evaluation of MapReduce behavior using data from a realistic environment will provide more accurate and broader results, while in controlled experiments the results are more restricted and limited to the evaluated metrics and factors.


5.3 Future Work

Because of the time constraints imposed on the master's degree, this dissertation addresses some problems, but others are still open and new ones are emerging from the current results. Thus, the following issues should be investigated as future work:

• Evaluation of all components of the proposed approach. This dissertation evaluated JNetPCAP-JXTA, the AppAnalyzer and its implementation to evaluate a JXTA-based distributed application; it is necessary to evaluate the SnifferServer, the Manager and the whole system working together, analysing their impact on the measured distributed system and the scalability achieved;

• Development of a technique for the efficient evaluation of distributed systems through information extracted from network traffic. This dissertation addressed the problem of processing capacity for measuring distributed systems through network traffic analysis, but an efficient approach is needed to diagnose problems of distributed systems, using information on flows, connections, throughput and response time obtained from network traffic analysis;

• Development of an analytic model and simulations, using the information on MapReduce behavior for network traffic analysis measured in this dissertation, to reproduce its characteristics and enable the evaluation and prediction of some cases of MapReduce for network traffic analysis.


Bibliography

Aguilera, M. K., Mogul, J. C., Wiener, J. L., Reynolds, P., and Muthitacharoen, A. (2003). Performance debugging for distributed systems of black boxes. SIGOPS Oper. Syst. Rev., 37(5).

Akram, S., Marazakis, M., and Bilas, A. (2012). Understanding scalability and performance requirements of I/O-intensive applications on future multicore servers. In Modeling, Analysis Simulation of Computer and Telecommunication Systems (MASCOTS), 2012 IEEE 20th International Symposium on.

Antonello, R., Fernandes, S., Kamienski, C., Sadok, D., Kelner, J., Godorc, I., Szaboc, G., and Westholm, T. (2012). Deep packet inspection tools and techniques in commodity platforms: Challenges and trends. Journal of Network and Computer Applications.

Antoniu, G., Hatcher, P., Jan, M., and Noblet, D. (2005). Performance evaluation of JXTA communication layers. In Cluster Computing and the Grid, 2005. CCGrid 2005. IEEE International Symposium on, volume 1, pages 251–258.

Antoniu, G., Cudennec, L., Jan, M., and Duigou, M. (2007). Performance scalability of the JXTA P2P framework. In Parallel and Distributed Processing Symposium, 2007. IPDPS 2007. IEEE International, pages 1–10.

Armbrust, M., Fox, A., Griffith, R., Joseph, A. D., Katz, R., Konwinski, A., Lee, G., Patterson, D., Rabkin, A., Stoica, I., and Zaharia, M. (2010). A view of cloud computing. Commun. ACM, 53, 50–58.

Basili, V. R., Caldiera, G., and Rombach, H. D. (1994). The goal question metric approach. In Encyclopedia of Software Engineering. Wiley.

Bhatotia, P., Wieder, A., Akkus, I. E., Rodrigues, R., and Acar, U. A. (2011). Large-scale incremental data processing with change propagation. In Proceedings of the 3rd USENIX conference on Hot topics in cloud computing, HotCloud'11, pages 18–18, Berkeley, CA, USA. USENIX Association.

Callado, A., Kamienski, C., Szabo, G., Gero, B., Kelner, J., Fernandes, S., and Sadok, D. (2009). A survey on internet traffic identification. Communications Surveys Tutorials, IEEE, 11(3), 37–52.


Chen, Y., Ganapathi, A., Griffith, R., and Katz, R. (2011). The case for evaluating MapReduce performance using workload suites. In Modeling, Analysis Simulation of Computer and Telecommunication Systems (MASCOTS), 2011 IEEE 19th International Symposium on.

Condie, T., Conway, N., Alvaro, P., Hellerstein, J. M., Elmeleegy, K., and Sears, R. (2010). MapReduce online. In Proceedings of the 7th USENIX conference on Networked systems design and implementation, pages 21–21.

Cox, L. P., Murray, C. D., and Noble, B. D. (2002). Pastiche: making backup cheap and easy. SIGOPS Oper. Syst. Rev., 36, 285–298.

Dean, J. and Ghemawat, S. (2008). MapReduce: simplified data processing on large clusters. Commun. ACM, 51, 107–113.

DeCandia, G., Hastorun, D., Jampani, M., Kakulapati, G., Lakshman, A., Pilchin, A., Sivasubramanian, S., Vosshall, P., and Vogels, W. (2007). Dynamo: Amazon's highly available key-value store. SIGOPS Oper. Syst. Rev., 41, 205–220.

Duigou, M. (2003). JXTA v2.0 protocols specification. Technical report, IETF Internet Draft.

Eddelbuettel, D. (2012). R in action. Journal of Statistical Software, Book Reviews, 46(2), 1–2.

Fernandes, S., Antonello, R., Lacerda, T., Santos, A., Sadok, D., and Westholm, T. (2009). Slimming down deep packet inspection systems. In INFOCOM Workshops 2009, IEEE.

Fonseca, A., Silva, M., Soares, P., Soares-Neto, F., Garcia, V., and Assad, R. (2012). Uma proposta arquitetural para serviços escaláveis de dados em nuvens. In Proceedings of the VIII Workshop de Redes Dinâmicas e Sistemas P2P.

Fox, A., Griffith, R., Joseph, A., Katz, R., Konwinski, A., Lee, G., Patterson, D., Rabkin, A., and Stoica, I. (2009). Above the clouds: A Berkeley view of cloud computing. Dept. Electrical Eng. and Comput. Sciences, University of California, Berkeley, Rep. UCB/EECS, 28.

Ghemawat, S., Gobioff, H., and Leung, S.-T. (2003). The Google file system. SIGOPS Oper. Syst. Rev.


Groot, S. (2012). Modeling I/O interference in data intensive map-reduce applications. In Applications and the Internet (SAINT), 2012 IEEE/IPSJ 12th International Symposium on.

Gunther, N. (2006). Guerrilla Capacity Planning: A Tactical Approach to Planning for Highly Scalable Applications and Services. Springer.

Guo, Z., Fox, G., and Zhou, M. (2012). Investigation of data locality and fairness in MapReduce. In Proceedings of third international workshop on MapReduce and its Applications Date, MapReduce '12.

Gupta, D., Vishwanath, K. V., McNett, M., Vahdat, A., Yocum, K., Snoeren, A., and Voelker, G. M. (2011). DieCast: Testing distributed systems with an accurate scale model. ACM Trans. Comput. Syst., 29, 4:1–4:48.

Halepovic, E. (2004). Performance evaluation and benchmarking of the JXTA peer-to-peer platform. Ph.D. thesis, University of Saskatchewan.

Halepovic, E. and Deters, R. (2003). The costs of using JXTA. In Peer-to-Peer Computing, 2003. (P2P 2003). Proceedings. Third International Conference on, pages 160–167.

Halepovic, E. and Deters, R. (2005). The JXTA performance model and evaluation. Future Gener. Comput. Syst., 21, 377–390.

Halepovic, E., Deters, R., and Traversat, B. (2005). JXTA messaging: Analysis of feature-performance tradeoffs and implications for system design. In R. Meersman and Z. Tari, editors, On the Move to Meaningful Internet Systems 2005: CoopIS, DOA, and ODBASE, volume 3761, pages 1097–1114. Springer Berlin / Heidelberg.

Hammoud, M. and Sakr, M. (2011). Locality-aware reduce task scheduling for MapReduce. In Cloud Computing Technology and Science (CloudCom), 2011 IEEE Third International Conference on, pages 570–576.

Jacobson, V., Leres, C., and McCanne, S. (1994). libpcap. http://www.tcpdump.org/.

Jain, R. (1991). The art of computer systems performance analysis - techniques for experimental design, measurement, simulation, and modeling. Wiley professional computing. Wiley.

Janert, P. K. (2010). Gnuplot in action: understanding data with graphs. Manning, Greenwich, CT.


Jiang, D., Ooi, B. C., Shi, L., and Wu, S. (2010). The performance of MapReduce: an in-depth study. Proc. VLDB Endow.

Kambatla, K., Pathak, A., and Pucha, H. (2009). Towards optimizing hadoop provisioning in the cloud. In Proc. of the First Workshop on Hot Topics in Cloud Computing.

Kandula, S., Sengupta, S., Greenberg, A., Patel, P., and Chaiken, R. (2009). The nature of data center traffic: measurements & analysis. In Proceedings of the 9th ACM SIGCOMM conference on Internet measurement conference, IMC '09, pages 202–208, New York, NY, USA. ACM.

Kavulya, S., Tan, J., Gandhi, R., and Narasimhan, P. (2010). An analysis of traces from a production MapReduce cluster. In Cluster, Cloud and Grid Computing (CCGrid), 2010 10th IEEE/ACM International Conference on.

Lämmel, R. (2007). Google's MapReduce programming model - revisited. Sci. Comput. Program., 68(3), 208–237.

Lee, G. (2012). Resource Allocation and Scheduling in Heterogeneous Cloud Environments. Ph.D. thesis, University of California, Berkeley.

Lee, K.-H., Lee, Y.-J., Choi, H., Chung, Y. D., and Moon, B. (2012). Parallel data processing with MapReduce: a survey. SIGMOD Rec.

Lee, Y., Kang, W., and Son, H. (2010). An internet traffic analysis method with MapReduce. In Network Operations and Management Symposium Workshops (NOMS Wksps), 2010 IEEE/IFIP, pages 357–361.

Lee, Y., Kang, W., and Lee, Y. (2011). A hadoop-based packet trace processing tool. In Proceedings of the Third international conference on Traffic monitoring and analysis, TMA'11.

Lin, H., Ma, X., Archuleta, J., Feng, W., Gardner, M., and Zhang, Z. (2010). MOON: MapReduce on opportunistic environments. In Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, pages 95–106. ACM.

Lin, J. (2012). MapReduce is good enough? If all you have is a hammer, throw away everything that's not a nail! Big Data.


Loiseau, P., Goncalves, P., Guillier, R., Imbert, M., Kodama, Y., and Primet, P.-B. (2009). Metroflux: A high performance system for analysing flow at very fine-grain. In Testbeds and Research Infrastructures for the Development of Networks & Communities and Workshops, 2009. TridentCom 2009. 5th International Conference on, pages 1–9.

Lu, P., Lee, Y. C., Wang, C., Zhou, B. B., Chen, J., and Zomaya, A. Y. (2012). Workload characteristic oriented scheduler for MapReduce. In Parallel and Distributed Systems (ICPADS), 2012 IEEE 18th International Conference on, pages 156–163.

Massie, M. L., Chun, B. N., and Culler, D. E. (2004). The Ganglia distributed monitoring system: design, implementation, and experience. Parallel Computing, 30(7), 817–840.

Mi, H., Wang, H., Yin, G., Cai, H., Zhou, Q., and Sun, T. (2012). Performance problems diagnosis in cloud computing systems by mining request trace logs. In Network Operations and Management Symposium (NOMS), 2012 IEEE.

Nagaraj, K., Killian, C., and Neville, J. (2012). Structured comparative analysis of systems logs to diagnose performance problems. In Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation, NSDI’12.

Oliner, A., Ganapathi, A., and Xu, W. (2012). Advances and challenges in log analysis. Commun. ACM, 55(2), 55–61.

Paul, D. (2010). JXTA-Sim2: A Simulator for the core JXTA protocols. Master’s thesis, University of Dublin, Ireland.

Pereira, R., Azambuja, M., Breitman, K., and Endler, M. (2010). An architecture for distributed high performance video processing in the cloud. In Cloud Computing (CLOUD), 2010 IEEE 3rd International Conference on.

Piyachon, P. and Luo, Y. (2006). Efficient memory utilization on network processors for deep packet inspection. In Proceedings of the 2006 ACM/IEEE Symposium on Architecture for Networking and Communications Systems, pages 71–80. ACM.

Risso, F., Baldi, M., Morandi, O., Baldini, A., and Monclus, P. (2008). Lightweight, payload-based traffic classification: An experimental evaluation. In Communications, 2008. ICC’08. IEEE International Conference on, pages 5869–5875. IEEE.

Rumen (2012). Rumen, a tool to extract job characterization data from job tracker logs. http://hadoop.apache.org/docs/MapReduce/r0.22.0/rumen.html. [Accessed December 2012].

Sambasivan, R. R., Zheng, A. X., De Rosa, M., Krevat, E., Whitman, S., Stroucken, M., Wang, W., Xu, L., and Ganger, G. R. (2011). Diagnosing performance changes by comparing request flows. In Proceedings of the 8th USENIX conference on Networked systems design and implementation, NSDI’11.

Shafer, J., Rixner, S., and Cox, A. (2010). The Hadoop distributed filesystem: Balancing portability and performance. In Performance Analysis of Systems Software (ISPASS), 2010 IEEE International Symposium on.

Sigelman, B. H., Barroso, L. A., Burrows, M., Stephenson, P., Plakal, M., Beaver, D., Jaspan, S., and Shanbhag, C. (2010). Dapper, a large-scale distributed systems tracing infrastructure. Technical report, Google, Inc.

Tan, J., Meng, X., and Zhang, L. (2012). Coupling scheduler for MapReduce/Hadoop. In Proceedings of the 21st international symposium on High-Performance Parallel and Distributed Computing, HPDC ’12.

Verma, A., Cherkasova, L., Kumar, V., and Campbell, R. (2012a). Deadline-based workload management for MapReduce environments: Pieces of the performance puzzle. In Network Operations and Management Symposium (NOMS), 2012 IEEE.

Verma, A., Cherkasova, L., and Campbell, R. (2012b). Two sides of a coin: Optimizing the schedule of MapReduce jobs to minimize their makespan and improve cluster performance. In Modeling, Analysis & Simulation of Computer and Telecommunication Systems (MASCOTS), 2012 IEEE 20th International Symposium on.

Vieira, T. (2012a). hadoop-dpi. http://github.com/tpbvieira/hadoop-dpi.

Vieira, T. (2012b). jnetpcap-jxta. http://github.com/tpbvieira/jnetpcap-jxta.

Vieira, T. (2013). hadoop-analyzer. http://github.com/tpbvieira/hadoop-analyzer.

Vieira, T., Soares, P., Machado, M., Assad, R., and Garcia, V. (2012a). Evaluating performance of distributed systems with MapReduce and network traffic analysis. In ICSEA 2012, The Seventh International Conference on Software Engineering Advances. Xpert Publishing Services.

Vieira, T., Soares, P., Machado, M., Assad, R., and Garcia, V. (2012b). Measuring distributed applications through MapReduce and traffic analysis. In Parallel and Distributed Systems (ICPADS), 2012 IEEE 18th International Conference on, pages 704–705.

Wang, G., Butt, A., Pandey, P., and Gupta, K. (2009). A simulation approach to evaluating design decisions in MapReduce setups. In Modeling, Analysis & Simulation of Computer and Telecommunication Systems, 2009. MASCOTS ’09. IEEE International Symposium on.

Yu, M., Greenberg, A., Maltz, D., Rexford, J., Yuan, L., Kandula, S., and Kim, C. (2011). Profiling network performance for multi-tier data center applications. In Proceedings of the 8th USENIX conference on Networked systems design and implementation, NSDI’11.

Yuan, D., Zheng, J., Park, S., Zhou, Y., and Savage, S. (2011). Improving software diagnosability via log enhancement. In ACM SIGARCH Computer Architecture News, volume 39, pages 3–14. ACM.

Zaharia, M., Konwinski, A., Joseph, A. D., Katz, R., and Stoica, I. (2008). Improving MapReduce performance in heterogeneous environments. In Proceedings of the 8th USENIX conference on Operating systems design and implementation, OSDI’08.

Zaharia, M., Borthakur, D., Sen Sarma, J., Elmeleegy, K., Shenker, S., and Stoica, I. (2010). Delay scheduling: a simple technique for achieving locality and fairness in cluster scheduling. In Proceedings of the 5th European conference on Computer systems, EuroSys ’10.

Zhang, S., Han, J., Liu, Z., Wang, K., and Feng, S. (2009). Accelerating MapReduce with distributed memory cache. In Parallel and Distributed Systems (ICPADS), 2009 15th International Conference on.

Zheng, Z., Yu, L., Lan, Z., and Jones, T. (2012). 3-dimensional root cause diagnosis via co-analysis. In Proceedings of the 9th international conference on Autonomic computing, pages 181–190. ACM.
