MapReduce Jobs For Video Conversion

Ankur Gupta, Harish Kumar Narware, Harsh Agrawal, Sourabh Gupta

11/23/09

Agenda

• Motivation

• Introduction

• Why MapReduce?

• What is FFmpeg?

• Project Description

• Challenges Faced

• Load Balancing (Optimization)

• Practical Use

Motivation

• MapReduce is a software framework introduced by Google.

• It supports distributed computing on large datasets across clusters of computers.

• The framework is inspired by the map and reduce functions commonly used in functional programming.

• For example, MapReduce can sort a petabyte of data in only a few hours.

Introduction

• In this project we have to convert a huge number of video files from one format to another.

• We are using the MapReduce framework.

• We are also using the open source video converter FFmpeg.

• The data will be retrieved from and stored on HDFS.

Why MapReduce?

• We need MapReduce because the number of video files to be converted is huge.

• Using the parallelism provided by MapReduce, we can complete the task in less time.

• Distributed computing also provides better utilization of resources.

What is FFmpeg?

• FFmpeg is a complete, cross-platform solution to record, convert and stream audio and video.

• FFmpeg is free software and is licensed under the LGPL or GPL.

• The FFmpeg source can be downloaded using SVN from the following link:

http://ffmpeg.org/download.html

Project Description

• Video files in a particular format, say AVI, are stored in HDFS.

• We accept an input file containing the HDFS locations of the video files and the format into which each file has to be converted.

• The conversion of the video format happens in the Map phase.

• In the Map phase, we first download the input video file from HDFS to the local system using the FileSystem API (copyToLocalFile()).

Project Description (cont.)

• We use FFmpeg to convert this file into the given format.

• The new file is then uploaded back into HDFS using the API copyFromLocalFile(), in the same directory and with the same name but with the extension of the new video format.

• The HDFS path of the new file is then returned as the output of the Map task.

• Reduce is not needed.
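
The naming rule above (same directory, same name, new extension) and the conversion command can be sketched in plain Java. This is only an illustration under assumptions: the class and method names are hypothetical and not from the project, and the command uses the basic "ffmpeg -i input output" form, with FFmpeg choosing the output format from the file extension.

```java
import java.util.List;

/** Illustrative helpers; class and method names are assumptions, not project code. */
final class ConversionPaths {
    private ConversionPaths() {}

    /** Replaces the extension of the file name in path, or appends one if it has none. */
    static String withExtension(String path, String newExtension) {
        int slash = path.lastIndexOf('/');
        int dot = path.lastIndexOf('.');
        // Only treat the dot as an extension separator if it lies inside the
        // file name (after the last '/') and is not the name's first character.
        String stem = (dot > slash + 1) ? path.substring(0, dot) : path;
        return stem + "." + newExtension;
    }

    /** Builds the argument list for a basic conversion: ffmpeg -i input output. */
    static List<String> ffmpegCommand(String localInput, String localOutput) {
        return List.of("ffmpeg", "-i", localInput, localOutput);
    }
}
```

For instance, withExtension("/videos/clip.avi", "mp4") gives "/videos/clip.mp4", which would be the HDFS path emitted by the Map task.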

Commands

• FileSystem hdfs = FileSystem.get(config);

• hdfs.copyToLocalFile(srcPath, dstPath);

• copyToLocalFile copies the file at srcPath in HDFS to dstPath on the local system.

• hdfs.copyFromLocalFile(srcPath, dstPath);

• copyFromLocalFile copies the file at srcPath on the local system to dstPath in HDFS. No file should be present at dstPath in HDFS.

Challenges We Faced

• An interesting problem we encountered was that we were not able to get the whole converted file when running FFmpeg commands in the Map task.

• The reason is that when a command is run from a Java program, it executes in a separate child process rather than inside the program's JVM, and our program was exiting before the child process could complete. Therefore only part of the file was being converted.

How We Solved the Problem

• We open a data stream on the standard output of the running ffmpeg command.

• We use a while loop that reads from this data stream and breaks only when the read returns null, that is, when the stream has ended because the conversion is complete.

• In this way, the map task waits for the child process to complete the conversion.
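
A minimal sketch of this waiting pattern, using only the standard Java library. The class and method names are hypothetical. Two choices here are assumptions rather than the project's code: the error stream is merged into standard output (FFmpeg writes its progress messages to standard error), and waitFor() is called after the read loop to collect the exit code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Illustrative runner; names are assumptions, not project code. */
final class ChildProcessRunner {
    private ChildProcessRunner() {}

    /** Exit code and captured output lines of a finished child process. */
    static final class Result {
        final int exitCode;
        final List<String> outputLines;

        Result(int exitCode, List<String> outputLines) {
            this.exitCode = exitCode;
            this.outputLines = outputLines;
        }
    }

    /** Starts the command and blocks until the child process has finished. */
    static Result runAndWait(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge stderr into stdout
        Process process = builder.start();

        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            // readLine() returns null only once the child has closed its output.
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        int exitCode = process.waitFor(); // child has exited; collect its status
        return new Result(exitCode, lines);
    }
}
```

Reading the output continuously also keeps the child from blocking on a full pipe buffer, and the exit code lets the map task detect a failed conversion.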

Challenges We Faced

• The main challenge was to distribute the input splits properly.

• Each input split should contain the paths of files to be converted such that the total video data to be converted per split remains approximately the same.

• For example, there should not be an input split that contains the paths of all the large video files. If that happens, the load becomes unbalanced.

Load Balancing (Optimization)

• Load balancing between map tasks is crucial.

• An approach:

• We sorted the records in the input file by file size in HDFS, using MapReduce.

• We then rewrote the input file by taking one file from the start of the sorted file and one from the end, then the second file and the second-to-last file, and so on.
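
The reordering step can be sketched as follows. This is an illustration only: the class and method names are hypothetical, and the input is assumed to be already sorted by ascending file size.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative reordering; names are assumptions, not project code. */
final class InterleaveBySize {
    private InterleaveBySize() {}

    /**
     * Takes records sorted by ascending size and returns them in the order
     * first, last, second, second-to-last, and so on.
     */
    static <T> List<T> interleaveEnds(List<T> sortedAscending) {
        List<T> result = new ArrayList<>(sortedAscending.size());
        int low = 0;
        int high = sortedAscending.size() - 1;
        while (low < high) {
            result.add(sortedAscending.get(low++));  // next smallest
            result.add(sortedAscending.get(high--)); // next largest
        }
        if (low == high) {
            result.add(sortedAscending.get(low));    // middle element of an odd-length list
        }
        return result;
    }
}
```

With sizes 1, 2, 3, 4, 5 the resulting order is 1, 5, 2, 4, 3, so consecutive records pair a small file with a large one.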

Load Balancing (continued)

• So, when equal numbers of video files are given to the map tasks, the total video data converted by each map task is somewhat better balanced.

• But this is still not the best method.

Load Balancing (continued)

• MapReduce provides a function to set the number of map tasks for a given job.

• job.setNumMapTasks(x);

• The parameter is only a hint for the number of map tasks. The actual number depends on how the getSplits() function of the job's InputFormat (here, the CustomInputFormat class) is implemented.

• A lower bound on the split size can be set via mapred.min.split.size.

Input Format

The framework relies on the InputFormat of the job to:

• Validate the input specification of the job.

• Split up the input file(s) into logical InputSplit instances, each of which is then assigned to an individual Mapper.

• Provide the RecordReader implementation used to glean input records from the logical InputSplit for processing by the Mapper.

• The default implementation splits the input into logical InputSplit instances based on the total size, in bytes, of the input files.

New Approach

• Provide our own InputFormat implementation for the map tasks.

• In the getSplits() function of the InputFormat class, we divide the input into splits on the basis of the total size of the video files assigned to each map task.

• In this function, we check the size in HDFS of each file listed in the input. When the total size of the files in an InputSplit exceeds a certain limit, we create a new InputSplit.

• An InputSplit is logical and consists of the path of the input file, a start offset and an end offset.
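
The grouping rule used inside getSplits() can be sketched without any Hadoop classes. This is a simplified illustration, not the project's CustomInputFormat: the class and method names are hypothetical, file sizes are passed in as a map instead of being queried from HDFS, and each group is closed as soon as its running total exceeds the limit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Illustrative grouping; names are assumptions, not project code. */
final class SizeBasedGrouping {
    private SizeBasedGrouping() {}

    /**
     * Groups file paths in iteration order. A group is closed once the total
     * size of its files exceeds limitBytes; the next file starts a new group.
     */
    static List<List<String>> groupBySize(Map<String, Long> fileSizes, long limitBytes) {
        List<List<String>> groups = new ArrayList<>();
        List<String> current = new ArrayList<>();
        long currentTotal = 0;
        for (Map.Entry<String, Long> entry : fileSizes.entrySet()) {
            current.add(entry.getKey());
            currentTotal += entry.getValue();
            if (currentTotal > limitBytes) {
                groups.add(current);          // this group stands in for one InputSplit
                current = new ArrayList<>();
                currentTotal = 0;
            }
        }
        if (!current.isEmpty()) {
            groups.add(current);              // leftover files form the last group
        }
        return groups;
    }
}
```

Each resulting group corresponds to the input of one map task, so the number of groups is the number of map tasks. Passing a LinkedHashMap keeps the files in the order they appear in the input file.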

New Approach (cont.)

• Here, we can define exactly the number of map tasks and the input for each map task.

• Set the input format for the job: job.setInputFormat(CustomInputFormat.class);

Practical Use

• There are many websites that convert video files from one format to another online. They can use this project to do so.

• Most of these websites do not use MapReduce right now.

• Examples of such sites are:

• http://www.zamzar.com/

• http://www.any-video-converter.com/products/for_video_free/

• http://www.getafreelancer.com/projects/PHP-Python/Youtube-API-video-conversion-website.html

• http://vixy.net/

Questions?
