Analytical Toolbox for Smart City Applications: Garbage Collection Log Use Case Takahiro Komamizu , Jin Nakazawa , Toshiyuki Amagasa , Hiroyuki Kitagawa , Hideyuki Tokuda University of Tsukuba, Keio University Japan

Analytical Toolbox for Smart City Applications: Garbage Collection … · 2020-04-09 · Preparation: Implicit Order Join r0 r1 r2 r3 1 1 10 d 2 4 20 d 1 2 30 e 2 3 50 e rank of r1


Analytical Toolbox for Smart City Applications: Garbage Collection Log Use Case

Takahiro Komamizu†, Jin Nakazawa‡, Toshiyuki Amagasa†, Hiroyuki Kitagawa†, Hideyuki Tokuda‡

†University of Tsukuba, ‡Keio University
Japan


Background

• Big Data + IoT + Smart City
• Large amounts of data come from various sources.
• Data tends to be noisy or inconsistent.
• Analytics is a way to feed results back into real-world activities.
• Analytics plays a critical role.
• Data alone does not tell what to do next.


Objective and Approach

• Objective
  • Making analytics easier
  • Handling noisy data
  • Interactive analysis
• Approach
  • Toolbox
    • Preparation: Data Integration, Data Cleansing
    • Analysis: OLAP, Stream Data Analysis
    • Visualization: Interactive I/F, Spatio-temporal Vis.

[Figure: example pipeline. Collecting-car logs (S) and routing information (R) enter by data insertion, are combined by "Disambiguation & Join" (R ⋈ Car=Car S), go through Analytics, and are presented through an Interactive I/F driven by user interaction.]

R (Routing Info.)
Car  Course  Area
A    A-1     D1
A    A-2     D2
B    B-1     D3
B    B-2     D1
C    C-1     D2
C    C-2     D3

S (Collecting Car)
Car  Weight  Time
A    100     4/1/17 10:00
B    150     4/1/17 11:00
C    120     4/1/17 12:00
A    200     4/1/17 13:00
B    180     4/1/17 14:00
C    110     4/1/17 15:00


Our tools

• Preparation
  • Implicit order join: joining two data sets that lack sufficient information (i.e., attributes) to join on
• Analysis
  • Stream OLAP*: OLAP analysis for both streaming data and static data
• Visualization
  • Geo-spatial visualization
  • Interactive interface

*OLAP = Online Analytical Processing


Preparation: Implicit Order Join

• A join on the available attributes gives unnecessary join results because of a lack of information to join on.
• This causes errors in further analysis.
• Find the implicit join key from feedback.
• Order-oriented correlation assumption.

R
r1  r2  r3
1   10  d
2   30  e
3   50  e
4   20  d

S
s1  s2   s3
d   300  1
e   200  2
d   200  3
e   400  4

R ⋈ (r3 = s1) S
[Figure: result of the join above, with tuples marked as expected join results or unexpected join results.]

R+ (r0 = rank of r1 within r3 values)
r0  r1  r2  r3
1   1   10  d
2   4   20  d
1   2   30  e
2   3   50  e

S+ (s0 = rank of s3 within s1 values)
s1  s2   s3  s0
d   300  1   1
d   200  3   2
e   200  2   1
e   400  4   2

R+ ⋈ (r3 = s1 ∧ r0 = s0) S+

Detail: [Komamizu et al., HMData@Big Data 2017]
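The rank-based join on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `add_rank` and `join` are hypothetical, the ordering attributes (r1 and s3) are picked by hand instead of being found from feedback as the slide describes, and ties in the ordering are not handled.

```python
from collections import defaultdict
from itertools import product


def add_rank(rows, group_col, order_col, rank_col):
    """Copy rows, adding rank_col: the 1-based rank of order_col within each group_col value."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_col]].append(row)
    ranked = []
    for members in groups.values():
        ordered = sorted(members, key=lambda r: r[order_col])
        for rank, row in enumerate(ordered, start=1):
            ranked.append({**row, rank_col: rank})
    return ranked


def join(left, right, condition):
    """Nested-loop theta join: merge every pair of rows that satisfies condition."""
    return [{**l, **r} for l, r in product(left, right) if condition(l, r)]


# Relations R and S as shown on the slide.
R = [
    {"r1": 1, "r2": 10, "r3": "d"},
    {"r1": 2, "r2": 30, "r3": "e"},
    {"r1": 3, "r2": 50, "r3": "e"},
    {"r1": 4, "r2": 20, "r3": "d"},
]
S = [
    {"s1": "d", "s2": 300, "s3": 1},
    {"s1": "e", "s2": 200, "s3": 2},
    {"s1": "d", "s2": 200, "s3": 3},
    {"s1": "e", "s2": 400, "s3": 4},
]

# Join on the explicit attribute only: r3 = s1.
naive = join(R, S, lambda l, r: l["r3"] == r["s1"])

# Add the implicit keys r0 and s0, then join on r3 = s1 AND r0 = s0.
R_plus = add_rank(R, "r3", "r1", "r0")
S_plus = add_rank(S, "s1", "s3", "s0")
implicit = join(
    R_plus, S_plus, lambda l, r: l["r3"] == r["s1"] and l["r0"] == r["s0"]
)

print(len(naive), len(implicit))
```

On this data the join on r3 = s1 alone pairs every "d" row with every "d" row (and likewise for "e"), while the rank condition keeps one partner per row.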


Analysis: Stream OLAP

[Figure: StreamOLAP architecture. Data sources (waste collection, city facility, SNS, traffic, road state, ...) each feed a Wrapper; the Wrappers feed the Stream Processing Engine (SPE) + OLAP Engine; the user interacts for OLAP through the Visual Interface.]

Detail: [Nakabasami et al., Big Data 2015]

Each wrapper converts heterogeneous data into a common format (e.g., relational).
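As a rough illustration of the wrapper idea, the sketch below turns two differently shaped feeds into tuples of one shared schema. Everything here is an assumption made for the example: the schema `SCHEMA`, the function names, and the contents of both feeds are invented and do not come from StreamOLAP itself.

```python
import csv
import io
import json

# Hypothetical common relational schema shared by all wrappers.
SCHEMA = ("source", "timestamp", "key", "value")


def wrap_waste_csv(text):
    """Wrapper for an invented CSV feed with columns car, weight, time."""
    for rec in csv.DictReader(io.StringIO(text)):
        yield ("waste_collection", rec["time"], rec["car"], float(rec["weight"]))


def wrap_traffic_json(text):
    """Wrapper for an invented JSON-lines feed of traffic counts."""
    for line in text.splitlines():
        if line.strip():
            rec = json.loads(line)
            yield ("traffic", rec["ts"], rec["road"], float(rec["count"]))


# Invented sample feeds in two different formats.
waste_feed = "car,weight,time\nA,100,2017-04-01T10:00\nB,150,2017-04-01T11:00\n"
traffic_feed = '{"ts": "2017-04-01T10:05", "road": "R1", "count": 42}\n'

# After wrapping, both sources are plain tuples with the same columns.
tuples = list(wrap_waste_csv(waste_feed)) + list(wrap_traffic_json(traffic_feed))
for t in tuples:
    print(dict(zip(SCHEMA, t)))
```

Once every source produces the same tuple shape, a downstream engine only has to understand one format, whatever the original encoding was.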


Visualization: Geo-spatial Visualization

[Figures: point-by-point relationship; trajectory and point visualization]

Detail: [Komamizu et al., SmartCity 2016]


Visualization: Interactive Interface

[Figure: interactive interface with panels for changes over time, per area, per area (house-oriented), weather, and season]


Use Case: Garbage Collection in Fujisawa city

• Logs record the amount of garbage per collection.
• A car drives one route at a time.
• A car drives several routes per day.

Table I: Schematic information of garbage collection logs.

Field           Explanation
slip number     the number of a record
timestamp       date and time of arrival of the collecting car
company type    classification of company of the car
garbage type    type of garbage (e.g., burnable)
car id          identifier of the car
total weight    overall weight of the car including garbage
car weight      weight of the car
garbage weight  total weight − car weight
unit price      price for garbage per kilogram
price           total price for the garbage
plant id        identifier of the garbage treatment plant
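The derived field in Table I can be written out as a small worked example. The class name `CollectionLog`, the sample values, and the use of kilograms are assumptions for illustration. The relation price = unit price × garbage weight is also an assumption; Table I only states that garbage weight is total weight minus car weight.

```python
from dataclasses import dataclass


@dataclass
class CollectionLog:
    """A subset of the Table I fields (hypothetical representation)."""

    slip_number: int
    car_id: str
    total_weight: float  # weight of the car including garbage (assumed kg)
    car_weight: float    # weight of the car itself (assumed kg)
    unit_price: float    # price for garbage per kilogram

    @property
    def garbage_weight(self) -> float:
        # Table I: garbage weight = total weight - car weight
        return self.total_weight - self.car_weight

    @property
    def price(self) -> float:
        # Assumption, not stated in Table I: price = unit price * garbage weight
        return self.unit_price * self.garbage_weight


# Invented sample record.
log = CollectionLog(
    slip_number=1, car_id="A", total_weight=5200.0, car_weight=4000.0, unit_price=10.0
)
print(log.garbage_weight, log.price)
```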

Table II: Schematic information of course information.

Field             Explanation
car id            identifier of the car
course id         identifier of the course
dow               day of the week when the car goes
block             block where the course belongs
number of houses  the number of houses on the course
young or elder    rough estimate classes (young or elder) of average ages of the course

Other contents: Other related contents are obtained from web services and statistics. G1GP utilizes weather records and population statistics and estimates. In this application, weather records are obtained from the download site [2] of the Japan Meteorological Agency [3]. The site provides various weather-related information such as maximum and minimum temperature, humidity, and weather description (e.g., fine, cloudy, rainy). For G1GP, the weather descriptions of the relevant days are obtained. Population statistics are obtained from the web site [4] of the National Institute of Population and Social Security Research [5].

[2] http://www.data.jma.go.jp/gmd/risk/obsdl/index.php
[3] http://www.jma.go.jp/jma/index.html
[4] http://www.ipss.go.jp/pp-shicyoson/j/shicyoson13/t-page.asp
[5] http://www.ipss.go.jp/

B. Analytics

G1GP is designed to motivate citizens to reduce garbage by revealing the facts about garbage amounts in Fujisawa city. G1GP consists of two basic interfaces: one is a comprehensive garbage analysis over the city, and the other is a competitive analysis comparing analytical results between blocks in the city. Figure 4 shows selected snapshots of G1GP in comprehensive and competitive analyses, respectively.

1) Comprehensive Analysis (Figure 4 (a)-(c)):

Interactive Interface: The interactive interface (Figure 4(a)) visualizes histogram-based analytical results under various conditions. There are six components on the interface. The right-most component displays the color palette representing the color assignments for blocks in the city. The left five components show aggregated garbage amounts by time period (top-most), by block (left-most), house-wise in blocks (middle), by weather (upper right), and by season (lower right). Users can interact with the interface by clicking the bars or setting the range of the time periods. Figure 4(a) displays the selection of the time period Jan. 2016 to Dec. 2016 in the top component and rainy weather in the upper-right component. The other components display the corresponding analytical results, such as garbage amounts per block on rainy days in 2016 in the left-most component.
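The selection described for Figure 4(a), a time range plus a weather value with the per-block totals updated accordingly, amounts to a filter followed by a group-by. The sketch below shows that idea only; the record layout, the function name `total_by_block`, and all numbers are invented and are not taken from G1GP.

```python
from collections import defaultdict
from datetime import date

# Invented joined records: (collection date, block, weather, garbage weight).
records = [
    (date(2016, 1, 12), "A", "rainy", 900.0),
    (date(2016, 1, 12), "B", "rainy", 700.0),
    (date(2016, 6, 3), "A", "fine", 1100.0),
    (date(2016, 9, 20), "A", "rainy", 800.0),
    (date(2017, 2, 1), "B", "rainy", 650.0),
]


def total_by_block(rows, start, end, weather=None):
    """Sum garbage weight per block for rows inside [start, end], optionally for one weather value."""
    totals = defaultdict(float)
    for day, block, row_weather, weight in rows:
        if start <= day <= end and (weather is None or row_weather == weather):
            totals[block] += weight
    return dict(totals)


# Same selection as in the text: Jan. 2016 to Dec. 2016, rainy days only.
rainy_2016 = total_by_block(
    records, date(2016, 1, 1), date(2016, 12, 31), weather="rainy"
)
print(rainy_2016)
```

Clicking a different bar or changing the time range would correspond to calling the same function with different arguments.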

Time-series Analysis: The transitions of garbage amounts in all blocks (Figure 4(b)) depict the overall tendency of garbage amounts as well as the different tendencies of garbage amount transitions in different blocks. This visualization invites discussion of seasonal differences within each block as well as among blocks. For instance, most of the blocks reduce the amounts of garbage in February and increase them from March to May. This is because April is when the Japanese fiscal year switches, so many events occur, such as changes of jobs and purchasing new facilities and removing the old ones, and these events incur large amounts of garbage. Users can also see differences in garbage amounts among blocks. Pair-wise comparison between blocks is also provided as competitive analysis in G1GP, shown later in this section.

Future Estimation: One of the important problems in cities is the future amount of garbage, because space for treated garbage is limited. G1GP estimates the future amounts of garbage using the population estimates provided by the national institute. Figure 4(c) plots the estimate of future garbage amounts based on the assumption that the amount per citizen does not change. By using the 'young or elder' field, G1GP provides rough estimates of garbage amounts for young people (blue line), elder people (black line), and overall (green line), respectively.
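Under the stated assumption that the amount per citizen stays constant, the estimate reduces to scaling the current amount by the ratio of projected to current population. The sketch below is a minimal illustration of that arithmetic; the function name and every number are made up and are not G1GP's actual figures.

```python
def estimate_future_garbage(current_total, current_population, projected_population):
    """Scale the current garbage amount by projected / current population.

    Assumes the amount per citizen does not change over time.
    """
    return {
        year: current_total * population / current_population
        for year, population in projected_population.items()
    }


# Made-up inputs: current annual amount (kt), current population, and projections.
current_total_kt = 100.0
current_population = 400_000
projection = {2020: 410_000, 2030: 400_000, 2040: 380_000}

estimates = estimate_future_garbage(current_total_kt, current_population, projection)
for year, amount in sorted(estimates.items()):
    print(year, round(amount, 1))
```

The same function could be applied separately to the young and elder classes, given a current amount and a population projection for each class.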

2) Competitive Analysis (Figure 4 (d)-(f)):

Garbage Amount Comparison: The first competitive analysis is a fundamental comparison (Figure 4(d)), that is, a comparison of total garbage amounts (blue boxes) and house-wise garbage amounts (black boxes) in the two blocks. This visualization gives an overview of the fundamental differences between the two blocks. In particular, differences in house-wise garbage amounts indicate that the two blocks have different characteristics for garbage wasting.

House-wise Garbage Amount Transition: Because blocks have different characteristics depending on seasons and populations, G1GP reveals time-series differences between two blocks (Figure 4(e)). For a fair comparison between blocks, the amounts of garbage are normalized to the garbage amounts per house. This visualization is expected to reveal seasonal differences between two blocks. In addition, the differences between blocks over time can indicate changes related to garbage wasting, such as campaigns for garbage reduction and changes in garbage policies.
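The per-house normalization can be illustrated with a short sketch. The block names, monthly totals, and house counts below are invented; the function name `per_house` is hypothetical.

```python
# Invented monthly totals per (block, month) and house counts per block.
monthly_totals = {
    ("A", "2016-01"): 12000.0,
    ("A", "2016-02"): 10000.0,
    ("B", "2016-01"): 6000.0,
    ("B", "2016-02"): 5400.0,
}
houses = {"A": 400, "B": 150}


def per_house(totals, houses_per_block):
    """Divide each (block, month) total by the number of houses in that block."""
    return {
        (block, month): amount / houses_per_block[block]
        for (block, month), amount in totals.items()
    }


normalized = per_house(monthly_totals, houses)
for key in sorted(normalized):
    print(key, normalized[key])
```

In this made-up data, block B produces less garbage in total than block A but more per house, which is the kind of difference the normalization is meant to expose.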


[Panel labels: Garbage Collection Logs; Collection Routing]


Preparation: Successful Joins

R+
Car  Course  Area  y
A    A-1     D1    1
A    A-2     D2    2
B    B-1     D3    1
B    B-2     D1    2
C    C-1     D2    1
C    C-2     D3    2

S+
x  Car  Weight  Time
1  A    100     4/1/17 10:00
1  B    150     4/1/17 11:00
1  C    120     4/1/17 12:00
2  A    200     4/1/17 13:00
2  B    180     4/1/17 14:00
2  C    110     4/1/17 15:00

R+ ⋈ (Car = Car ∧ y = x) S+

Implicit keys: y (in R+) and x (in S+)

[Figure: bar chart of #Joined tuples (axis from 0 to 3,000,000) for Naive, Proposed, and Expected; other labels: Costs, Joined]
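The effect of the implicit keys on the slide's small car example can be checked by counting joined tuples. This sketch uses the x and y columns exactly as given in R+ and S+; it does not derive them, and the plain tuple layout is an assumption made for brevity.

```python
# R+ rows as (Car, Course, Area, y) and S+ rows as (x, Car, Weight, Time), copied from the slide.
R_plus = [
    ("A", "A-1", "D1", 1), ("A", "A-2", "D2", 2),
    ("B", "B-1", "D3", 1), ("B", "B-2", "D1", 2),
    ("C", "C-1", "D2", 1), ("C", "C-2", "D3", 2),
]
S_plus = [
    (1, "A", 100, "4/1/17 10:00"), (1, "B", 150, "4/1/17 11:00"),
    (1, "C", 120, "4/1/17 12:00"), (2, "A", 200, "4/1/17 13:00"),
    (2, "B", 180, "4/1/17 14:00"), (2, "C", 110, "4/1/17 15:00"),
]

# Join on Car only: every course of a car pairs with every log of that car.
naive = [(r, s) for r in R_plus for s in S_plus if r[0] == s[1]]

# Join on Car and on the implicit keys (y = x): one log per course.
implicit = [
    (r, s) for r in R_plus for s in S_plus if r[0] == s[1] and r[3] == s[0]
]

weight_by_course = {r[1]: s[2] for r, s in implicit}
print(len(naive), len(implicit))
print(weight_by_course)
```

With two courses and two logs per car, the join on Car alone yields four tuples per car, whereas adding the implicit keys assigns each log to exactly one course.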


Analytic Scenarios: Comprehensive Analysis

[Figure: monthly garbage weight per area (areas A to M)]
[Figure: garbage estimate, annual garbage weight (kt) by year, for Young, Elder, and All]

↑ Monthly change per area.
↖ Rainy day in 2016.
← Future estimates.
etc.


Analytic Scenarios: Comparative Analysis

[Figure: monthly garbage weight, house-wise garbage weight by month, for areas A and B]
[Figure: comparison of garbage weights for areas A and B, showing total garbage weights and house-wise garbage weights]

↑ Garbage amount comparison in total and in house-wise perspectives.
↑ Monthly changes of garbage amounts on different areas.


Conclusion

Toolbox
• Preparation: Data Integration, Data Cleansing
• Analysis: OLAP, Stream Data Analysis
• Visualization: Interactive I/F, Spatio-temporal Vis.

[Figure: the pipeline from the "Objective and Approach" slide, repeated. Collecting-car logs (S: Car, Weight, Time) and routing information (R: Car, Course, Area) enter by data insertion, are combined by "Disambiguation & Join" (R ⋈ Car=Car S), go through Analytics, and are presented through an Interactive I/F driven by user interaction.]