SATELLITE IMAGE ORTHORECTIFICATION OVER HDFS USING SPARK

IRT, OCE PROJECT


Mathias ORTNER <[email protected]>
Gregory FLANDIN <[email protected]>
Marc SPIGAI <[email protected]>


INTRODUCTION


BIG DATA?

  • store, process and analyse large volumes of data
  • maximal volume not known a priori
  • cost = c x Volume
  • the nature of the data or of the processing may change


SATELLITE IMAGE GROUND SEGMENT

We have / expect:

  • a large volume of images to store
  • a large volume of images to explore
  • new competitors from the “big data world”


OUR WORK!

  • a well-known case: orthorectification of SPOT 6 images (L1 to L2)
  • study HDFS + Spark over Google cloud
  • what are the advantages / drawbacks?
  • how does it fit with our usual High Performance approaches?


HADOOP

WHAT IS HADOOP?


IN SHORT

  • Distributed data storage
      1. store large volumes
      2. ... over low cost hardware
  • scalable
  • MapReduce
  • Not POSIX


DATA REPLICATION

  • Data is stored on several computers (a cluster)
  • Data is split into pieces (blocks, typically 64 MB or 128 MB)
  • HDFS is not POSIX (it is accessed through its own set of commands)
  • One namenode, several datanodes
  • The file system is aware of data locality
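As a rough illustration of replication and data locality, the sketch below places each block on several datanodes and lists the blocks a given node can process locally. The round-robin placement, the node names and the 300 MB file are assumptions made for the example; real HDFS placement is decided by the namenode and follows its own policy.

```python
import math


def place_replicas(n_blocks, datanodes, replication):
    """Toy placement: block i is copied to `replication` consecutive datanodes.

    Round-robin is only a stand-in; it is not the HDFS placement policy.
    """
    if replication > len(datanodes):
        raise ValueError("replication cannot exceed the number of datanodes")
    return {
        i: [datanodes[(i + k) % len(datanodes)] for k in range(replication)]
        for i in range(n_blocks)
    }


def local_blocks(layout, node):
    """Blocks that `node` holds, i.e. that it can read without network transfer."""
    return [block for block, nodes in layout.items() if node in nodes]


MB = 1024 * 1024
n_blocks = math.ceil(300 * MB / (64 * MB))   # hypothetical 300 MB file, 64 MB blocks
layout = place_replicas(n_blocks, ["dn1", "dn2", "dn3"], replication=2)

for block, nodes in layout.items():
    print(f"block {block} stored on {nodes}")
print("blocks local to dn1:", local_blocks(layout, "dn1"))
```

A scheduler that knows this layout can send the work on a block to one of the nodes that already stores it, which is the point of a file system aware of data locality.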


SPLIT AND CLONE


AND DISTRIBUTE


MAPREDUCE (MAP)


MAPREDUCE (REDUCE)
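The map and reduce phases can be sketched in plain Python, without Hadoop. This is only an illustration of the idea: the histogram task, the pixel values and the function names are made up for the example, and the grouping step between the two phases (the shuffle) is done here with a dictionary.

```python
from collections import defaultdict


def map_phase(block):
    """Map: emit one (key, value) pair per record of a block."""
    return [(pixel, 1) for pixel in block]


def shuffle(pairs):
    """Group every emitted value under its key before the reduce phase."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all the values emitted for one key."""
    return key, sum(values)


# Hypothetical pixel values, one list per block of the distributed file.
blocks = [[0, 3, 3], [3, 7], [0, 7, 7]]

mapped = [pair for block in blocks for pair in map_phase(block)]
histogram = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(histogram)
```

Each call to `map_phase` only needs its own block, so the map work can run on the node that stores that block.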


EXAMPLE (LS)

    [ortner@cluster-data-master ~]$ hdfs dfs -ls /
    Found 2 items
    -rw-r--r-- 2 ortner supergroup 1030588144 2015-04-28 19:09 /brisbane1-T4000-BP.avro


EXAMPLE (PUT)

    [ortner@cluster-data-master ~]$ hdfs dfs -put temp/brisbane1-T1000-BB.avro /
    [ortner@cluster-data-master ~]$ hdfs dfs -ls /
    Found 2 items
    -rw-r--r-- 2 ortner supergroup 281398027 2015-04-30 14:57 /brisbane1-T1000-BB.avro
    -rw-r--r-- 2 ortner supergroup 1030588144 2015-04-28 19:09 /brisbane1-T4000-BP.avro


EXAMPLE (STATUS OF FILE)

    [ortner@cluster-data-master ~]$ hdfs fsck /brisbane1-T4000-BP.avro -blocks -files -location
    Connecting to namenode via http://cluster-data-master:50070
    FSCK started by ortner (auth:SIMPLE) from /10.240.11.148 for path /brisbane1-T4000-BP.avro a
    /brisbane1-T4000-BP.avro 1030588144 bytes, 8 block(s): OK
    0. BP-1139583743-10.240.69.224-1430248096224:blk_1073741825_1001 len=134217728 repl=2
    1. BP-1139583743-10.240.69.224-1430248096224:blk_1073741826_1002 len=134217728 repl=2
    2. BP-1139583743-10.240.69.224-1430248096224:blk_1073741827_1003 len=134217728 repl=2
    3. BP-1139583743-10.240.69.224-1430248096224:blk_1073741828_1004 len=134217728 repl=2
    4. BP-1139583743-10.240.69.224-1430248096224:blk_1073741829_1005 len=134217728 repl=2
    5. BP-1139583743-10.240.69.224-1430248096224:blk_1073741830_1006 len=134217728 repl=2
    6. BP-1139583743-10.240.69.224-1430248096224:blk_1073741831_1007 len=134217728 repl=2
    7. BP-1139583743-10.240.69.224-1430248096224:blk_1073741832_1008 len=91064048 repl=2

    Status: HEALTHY
     Total size: 1030588144 B
     Total dirs: 0
     Total files: 1
     Total symlinks: 0
     Total blocks (validated): 8 (avg. block size 128823518 B)
     Minimally replicated blocks: 8 (100.0 %)
     Over-replicated blocks: 0 (0.0 %)
     Under-replicated blocks: 0 (0.0 %)
     Mis-replicated blocks: 0 (0.0 %)
    FSCK ended at Thu Apr 30 15:04:22 UTC 2015 in 1 milliseconds

    The filesystem under path '/brisbane1-T4000-BP.avro' is HEALTHY
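The figures in this fsck report can be checked by hand. The short calculation below uses only numbers that appear in the report (file size, block length, replication factor); the variable names are ours.

```python
file_size = 1030588144     # bytes, "Total size" in the fsck report
block_size = 134217728     # bytes, len of blocks 0 to 6 (128 MiB)
replication = 2            # repl=2 on every block

# Number of full blocks, and size of the last, partial block.
full_blocks, last_block = divmod(file_size, block_size)
n_blocks = full_blocks + (1 if last_block else 0)

avg_block = file_size // n_blocks        # compare with "avg. block size"
raw_storage = file_size * replication    # bytes actually used on the datanodes

print("blocks:", n_blocks)
print("last block:", last_block, "bytes")
print("average block size:", avg_block, "bytes")
print("raw storage with replication:", raw_storage, "bytes")
```

The block count, the length of block 7 and the average block size all match the report.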


ORTHORECTIFICATION

WHAT IS ORTHORECTIFICATION?


ON GROUND PROJECTION

  • The image is acquired in sensor geometry (L1)
  • but the user needs a projection on the ground (L2)

The projection accounts for:

  • Time measurements
  • Line of sight calibration
  • Satellite attitude measurements
  • Satellite orbit measurements
  • Terrain model


SPOT6 SPECIFICITY: MULTIPLE SENSORS

The system swath is 60 km; the native resolution is 2.2 m.

  • two cameras (1 and 2),
  • in each camera, two retinas (A and B),
  • in each retina, 5 bands (1 panchromatic and 4 multispectral),
  • panchromatic band: 7000 pixels; multispectral bands: 1500 pixels each.

The L1 product is therefore actually made of 20 images.
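The count of 20 images follows from enumerating every camera, retina and band combination. In the sketch below the band labels and the image naming scheme are placeholders, not the actual product naming.

```python
from itertools import product

cameras = (1, 2)
retinas = ("A", "B")
bands = ("PAN", "MS1", "MS2", "MS3", "MS4")   # placeholder band labels

# One L1 image per (camera, retina, band) combination.
images = [f"cam{c}-ret{r}-{b}" for c, r, b in product(cameras, retinas, bands)]

print(len(images), "images in the L1 product")
print(images[:5])
```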


INPUT IS L1

We start from a Level 1 image, which is the native image in the focal plane geometry.


OUTPUT IS L2

We produce a Level 2 image, i.e. an image that is projected on the ground using a Digital Terrain Model and a cartographic frame.


IN BETWEEN:

The production relies on four steps:

  1. DTM interpolation
  2. Inverse localization (from ground to focal plane)
  3. Image interpolation (pixel lookup, B-spline resampler)
  4. Fusion for overlapping parts
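The four steps can be sketched on a toy image. This is not the production algorithm: the inverse localization model below is a made-up height-dependent column shift, bilinear interpolation replaces the B-spline resampler, and fusion is a plain average. All function names are hypothetical.

```python
def bilinear(grid, col, row):
    """Bilinear lookup, clamped to the grid (stand-in for the B-spline resampler)."""
    rows, cols = len(grid), len(grid[0])
    col = min(max(col, 0.0), cols - 1.0)
    row = min(max(row, 0.0), rows - 1.0)
    c0, r0 = min(int(col), cols - 2), min(int(row), rows - 2)
    dc, dr = col - c0, row - r0
    top = grid[r0][c0] * (1 - dc) + grid[r0][c0 + 1] * dc
    bottom = grid[r0 + 1][c0] * (1 - dc) + grid[r0 + 1][c0 + 1] * dc
    return top * (1 - dr) + bottom * dr


def inverse_localization(x, y, h):
    """Step 2: ground (x, y, h) -> focal plane (col, row).

    Hypothetical model: the column shifts with terrain height. A real model
    uses timing, attitude, orbit and line-of-sight calibration.
    """
    return x + h / 100.0, y


def orthorectify(l1_image, dtm, width, height):
    """Build the L2 image pixel by pixel, going from ground to sensor."""
    l2 = []
    for y in range(height):
        line = []
        for x in range(width):
            h = bilinear(dtm, x, y)                      # step 1: DTM interpolation
            col, row = inverse_localization(x, y, h)     # step 2
            line.append(bilinear(l1_image, col, row))    # step 3: image interpolation
        l2.append(line)
    return l2


def fuse(a, b):
    """Step 4: average two overlapping L2 patches (one simple fusion rule)."""
    return [[(pa + pb) / 2 for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]


# Toy data: pixel value = column index, flat terrain at height 100.
l1 = [[float(c) for c in range(4)] for _ in range(3)]
dtm = [[100.0] * 4 for _ in range(3)]
l2 = orthorectify(l1, dtm, width=4, height=3)
print(l2[0])
```

Every output pixel is computed independently of the others, so the output grid can be cut into tiles and each tile produced by a different worker.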


IN BETWEEN:


TERRAIN MODEL IS A SHARED RESOURCE (SRTM)

The SRTM terrain model is standard, freely available and widely used.


LARGE VOLUMES OF DATA TO BE HANDLED

In this study we focus on a full PAN image production, with typical dimensions of:

  • 44754 x 49135 pixels (1X)
  • 110858 x 81327 pixels (4X)
  • 352829 x 42817 pixels (7X)
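To give an order of magnitude for these dimensions, the sketch below converts them to pixel counts and approximate sizes. The 2 bytes per pixel is an assumption (16-bit storage); the slides do not state the encoding.

```python
BYTES_PER_PIXEL = 2   # assumption: 16-bit storage, not stated in the slides

products = {
    "1X": (44754, 49135),
    "4X": (110858, 81327),
    "7X": (352829, 42817),
}


def pixel_count(dim_a, dim_b):
    """Total number of pixels of an image with the given dimensions."""
    return dim_a * dim_b


for name, dims in products.items():
    n = pixel_count(*dims)
    size_gib = n * BYTES_PER_PIXEL / 2**30
    print(f"{name}: {n / 1e9:.1f} Gpixel, about {size_gib:.1f} GiB")
```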


APACHE SPARK

WHAT IS SPARK?


APACHE PROJECT

Use distributed computation and MapReduce easily.

It is written in

  • Akka, on top of
  • Scala, on top of
  • Java

It has bindings in Python, Scala and Java.


RESILIENT DISTRIBUTED DATASETS (RDDs)

All data is stored in collections of objects called resilient distributed datasets (RDDs).

Collections are distributed on the network.


ACTIONS ON RDDs

We have different possible actions:

  • Creation
  • Mapping (transform an RDD into another one)
  • Reductions
  • Writing, collection
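The four kinds of operation can be mimicked with plain Python lists standing in for a partitioned collection. This is a toy model to show the idea, not Spark's API; the function names and the tile ids are made up for the example.

```python
from functools import reduce


def parallelize(data, n_partitions):
    """Creation: split a local list into partitions (one per worker)."""
    return [data[i::n_partitions] for i in range(n_partitions)]


def map_partitions(partitions, f):
    """Mapping: build a new partitioned collection, element by element."""
    return [[f(x) for x in part] for part in partitions]


def reduce_partitions(partitions, op):
    """Reduction: reduce inside each partition, then combine the partial results."""
    partials = [reduce(op, part) for part in partitions if part]
    return reduce(op, partials)


def collect(partitions):
    """Collection: bring every element back to the driver as one list."""
    return [x for part in partitions for x in part]


tiles = parallelize(list(range(1, 9)), n_partitions=4)   # hypothetical tile ids
squares = map_partitions(tiles, lambda t: t * t)
total = reduce_partitions(squares, lambda a, b: a + b)

print("partitions:", tiles)
print("sum of squares:", total)
print("collected:", sorted(collect(squares)))
```

The reduction only gives the same result as a sequential one when the operator is associative and commutative, as addition is here.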


LAZY EVALUATION

Spark exposes a functional-style API:

  • Define rules...
  • ... only the rules that are needed are applied
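Lazy evaluation can be illustrated with a small class that records rules and applies them only when a result is requested. This is a toy model, not Spark's implementation; `LazyCollection`, `take` and `expensive` are names chosen for the example.

```python
class LazyCollection:
    """Toy model of lazy evaluation: rules are stored, not executed."""

    def __init__(self, source, steps=()):
        self._source = source
        self._steps = tuple(steps)

    def map(self, f):
        # Transformation: only records the rule, nothing is computed yet.
        return LazyCollection(self._source, self._steps + (f,))

    def take(self, n):
        # Action: rules are applied now, and only to the n elements requested.
        out = []
        for item in self._source:
            if len(out) == n:
                break
            for f in self._steps:
                item = f(item)
            out.append(item)
        return out


calls = []


def expensive(x):
    calls.append(x)   # record each real evaluation
    return x * 10


rules = LazyCollection(range(1_000_000)).map(expensive)
assert calls == []    # defining the rule did no work
first = rules.take(3)
print(first, "computed with", len(calls), "calls")
```

Although the rule is defined over a million elements, only the three elements actually requested are ever evaluated.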


GOOGLE CLOUD ENGINE

PAY (MODERATELY) FOR (HIGH) USAGE!


WHAT IS IT?

Create and use virtual machines...


EXAMPLE: CREATE A MACHINE!

    #!/bin/bash

    gcloud compute instances create cluster-data-master \
        --image centos7-image-java-xvfb \
        --disk name=data-disk device-name=sdb mode=rw \
        --local-ssd interface=SCSI \
        --metadata-from-file startup-script=startup.sh \
        --machine-type n1-highmem-16

    Created [https://www.googleapis.com/compute/v1/projects/XXXX/zones/europe-west1-b/instances/

    NAME                ZONE           MACHINE_TYPE  INTERNAL_IP EXTERNAL_IP  STATUS
    cluster-data-master europe-west1-b n1-highmem-16 10.240.5.82 104.155.0.44 RUNNING


HOW MUCH DOES IT COST?

    Machine type     Virtual CPUs   Memory    Typical price (USD) per hour
    n1-standard-1     1             3.75 GB   $0.038
    n1-standard-2     2             7.5 GB    $0.076
    n1-standard-4     4             15 GB     $0.152
    n1-standard-8     8             30 GB     $0.304
    n1-standard-16   16             60 GB     $0.608
    n1-highmem-2      2             13 GB     $0.096
    n1-highmem-4      4             26 GB     $0.192
    n1-highmem-8      8             52 GB     $0.384
    n1-highmem-16    16             104 GB    $0.768
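With these hourly prices, the cost of a run is the price times the number of machines times the duration. The sketch below uses the prices from the table; the cluster size, machine type and duration in the usage lines are hypothetical and are not figures from this study.

```python
PRICE_PER_HOUR = {   # USD per hour, copied from the table above
    "n1-standard-1": 0.038,
    "n1-standard-2": 0.076,
    "n1-standard-4": 0.152,
    "n1-standard-8": 0.304,
    "n1-standard-16": 0.608,
    "n1-highmem-2": 0.096,
    "n1-highmem-4": 0.192,
    "n1-highmem-8": 0.384,
    "n1-highmem-16": 0.768,
}


def cluster_cost(machine_type, n_machines, hours):
    """Cost in USD of running n_machines of one type for the given duration."""
    return PRICE_PER_HOUR[machine_type] * n_machines * hours


# Hypothetical run: 24 n1-highmem-16 workers for 2 hours.
cost = cluster_cost("n1-highmem-16", 24, 2)
print(f"{cost:.2f} USD")
```

Within each family the table price is proportional to the number of virtual CPUs, so for a fixed total number of cores the hourly cost does not depend on how the cores are grouped into machines.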


RESULT

☺ ERGONOMICS!

Distributing the algorithm is incredibly easy to write.


WHY USE HDFS / MAPREDUCE?


FIRST AND MAIN RESULT

  • input is on HDFS
  • output is on HDFS
  • production is made using Spark


PRODUCE 4 UNITS ON 4 SLAVES...


... OR 24 UNITS ON 12 SLAVES ...


... 36 UNITS ON 18 SLAVES ...


OR 48 ON 24 SLAVES ...


IN THE SAME AMOUNT OF TIME!


INCREASE PRODUCTION BY ADDING CORES


WHAT WE HAVE DONE:

  • Analysis of a High Performance single-computer orthorectification algorithm
  • Full implementation of a scalable orthorectification algorithm
  • Deployment on Google cloud engine
  • Analysis of distribution performance
