Upload
others
View
9
Download
0
Embed Size (px)
Citation preview
Analytical Toolbox for Smart City Applications: Garbage Collection Log Use Case
Takahiro Komamizu†, Jin Nakazawa‡, Toshiyuki Amagasa†, Hiroyuki Kitagawa†,
Hideyuki Tokuda‡
†University of Tsukuba, ‡Keio UniversityJapan
Background
• Big Data + IoT + Smart City• Large amount of data from various resources• Data tends to be noisy or inconsistent.• Analytics is a feedback way to real-world
activities.• Analytics plays a critical role.• Data does not solely tell what to do next.
Objective and Approach
• Objective•Making analytics easier• Handling noisy
data• Interactive
analysisPreparation
Data Integration Data Cleansing
AnalysisOLAP Stream Data Analysis
VisualizationInteractive I/F Spatio-temporal Vis.
Toolbox
RCar Course Area
A A-1 D1
A A-2 D2
B B-1 D3
B B-2 D1
C C-1 D2
C C-2 D3
SCar Weight Time
A 100 4/1/17 10:00
B 150 4/1/17 11:00
C 120 4/1/17 12:00
A 200 4/1/17 13:00
B 180 4/1/17 14:00
C 110 4/1/17 15:00
⋈ Car=Car
Collecting Car
Car Course Area
A A-1 D1
A A-2 D2
B B-1 D3
B B-2 D1
C C-1 D2
C C-2 D3
Car Course Area
A A-1 D1
A A-2 D2
B B-1 D3
B B-2 D1
C C-1 D2
C C-2 D3
Car Course Area
A A-1 D1
A A-2 D2
B B-1 D3
B B-2 D1
C C-1 D2
C C-2 D3
Routing Info.
Disambiguation & Join
Analytics
Interactive I/F
data insertion
user interaction
• Approach
Our tools
• Preparation• Implicit order join: joining two data having
insufficient information (i.e., attributes)• Analysis• Stream OLAP*: OLAP analysis for both
streaming data and static data• Visualization• Geo-spatial visualization• Interactive interface
*OLAP = Online Analytical Processing
Preparation:Implicit Order Join
r0 r1 r2 r31 1 10 d
2 4 20 d
1 2 30 e
2 3 50 e
rank of r1within r3 values
s1 s2 s3 s0d 300 1 1
d 200 3 2
e 200 2 1
e 400 4 2
rank of s3within s1 values
R+ S+⋈ "#$%&∧")$%)
Rs1 s2 s3d 300 1
e 200 2
d 200 3
e 400 4
Sr1 r2 r31 10 d
2 30 e
3 50 e
4 20 d
⋈ "#$%&
Expected join resultsUnexpected join results
• Give unnecessary join results cuz. Lack of info. to join.• Causes errors on
further analysis.
• Find implicit join key from feedbacks.• Order-oriented
correlation assumption.
Detail: [Komamizu et al., HMData@Big Data 2017]
Analysis:Stream OLAP
StreamOLAP
User
Interactingfor OLAP
wastecollection
cityfacility
SNS traffic roadstate
...
...
Wrapper Wrapper Wrapper Wrapper Wrapper ......
Visual Interface Stream Processing Engine (SPE)
+OLAP Engine
Detail: [Nakabasami et al., Big Data 2015]
Wrapper converts hetero data into a common format (e.g., relational).
Visualization:Geo-spatial Visualization
Point-by-Point relationship Trajectory and point visualization
Detail: [Komamizu et al., SmartCity 2016]
Visualization:Interactive Interface
Changes over time
Seasonly
Weather
per area per areahouse-oriented
Use Case: Garbage Collection in Fujisawa city• Logs of amount of garbage per collection• Car drives a route at a time.• Car drives several routes per day.
Table I: Schematic information of garbage collection logs.
Field Explanationslip number the number of a recordtimestamp date and time of arrival of the collecting carcompany type classification of company of the cargarbage type type of garbage (e.g., burnable)car id identifier of the cartotal weight overall weight of the car including garbagecar weight weight of the cargarbage weight total weight � car weightunit price price for garbage per kilogramprice total price for the garbageplant id identifier of the garbage treatment plant
Table II: Schematic information of course information.
Field Explanationcar id identifier of the carcourse id identifier of the coursedow day of the week when the car goesblock block where the course belongsnumber of houses the number of houses on the courseyoung or elder rough estimate classes (young or elder)
of average ages of the course
Other contents: Other related contents are obtainedfrom the web services and statistics. G1GP utilizes weatherrecords and population statistics and estimation. In thisapplication, weather records are obtained from downloadingsite2 of Japan Meteorological Agency3. The site providesvarious information related with weather such as max andmin temperature, humidity, weather description (e.g., fine,cloudy, rainy), and so on. For G1GP, weather descriptions ofrelevant days are obtained. Population statistics is obtainedfrom web site4 of National Institute of Population and SocialSecurity Research5.
B. Analytics
G1GP is deigned for motivating citizens for reducinggarbage wastes by revealing the facts about garbage amountsin Fujisawa city. G1GP consists of two basic interfaces: oneis comprehensive garbage analysis over the city, and theother is competitive analysis comparing analytical resultsbetween blocks in the city. Figure 4 shows selected snap-shots of G1GP in comprehensive and competitive analyses,respectively.
1) Comprehensive Analysis (Figure 4 (a)-(c)):Interactive Interface: The interactive interface (Fig-
ure 4(a)) visualizes histogram-based analytical results invarious conditions. There are six components on the inter-face. The right-most component displays the color paletterepresenting color assignments for blocks in the city. The left
2http://www.data.jma.go.jp/gmd/risk/obsdl/index.php3http://www.jma.go.jp/jma/index.html4http://www.ipss.go.jp/pp-shicyoson/j/shicyoson13/t-page.asp5http://www.ipss.go.jp/
five components show aggregated garbage amounts in time-periods (top-most), blocks (left-most), house-wise in blocks(middle), weather (right-up), and season (right-down). Userscan interact with the interface by clicking the bars orsetting the range of the time-periods. Figure 4(a) displaysselection of time-period as Jan. 2016 to Dec. 2016 at the topcomponent and rainy weather at right-up component. Theother results are displaying corresponding analytical results,such as garbage amounts per block in 2016 of rainy days inthe left-most component.
Time-series Analysis: The transitions of garbageamounts in all blocks (Figure 4(b)) depict overall tendencyof garbage amounts as well as different tendencies ofgarbage amount transitions in different blocks. This visu-alization incurs discussion on seasonal differences withineach block as well as among blocks. For instance, mostof the blocks reduce the amounts of garbage in Februaryand increase those from March to May. This is becauseApril is the switch of Japanese fiscal years, therefore manyevents occur such as movements of jobs, purchasing newfacilities and removing the old, etc., and these events incurslarge amounts of garbage. Also, users can see differencesof garbage amounts among blocks. Pair-wise comparisonbetween blocks is also provided as competitive analysis inG1GP shown later in this section.
Future Estimation: One of the important problem incities is future amounts of garbage because of limited spacefor treated garbage. G1GP estimates the future amounts ofgarbage using the estimates from population statistics pro-vided by the national institute. Figure 4(c) draws the estimateof garbage amounts in the future based on an assumptionthat the amount per a citizen does not change. By using‘young or elder’ field, G1GP provides rough estimates ofgarbage amounts on young people (blue line), elder people(black line), and overall (green line) respectively.
2) Competitive Analysis (Figure 4 (d)-(f)):Garbage Amount Comparison: The first competitive
analysis is a fundamental comparison (Figure 4(d)), thatis, comparisons of total garbage amounts (blue boxes) andhouse-wise garbage amounts (black boxes) in the two blocks.This visualization overviews the fundamental differences oftwo blocks. In particular, differences of house-wise garbageamounts indicate the two blocks have different characteris-tics for garbage wasting.
House-wise Garbage Amount Transition: Due to thedifferent characteristics of blocks based on seasons andpopulations, G1GP reveals time-series differences betweentwo blocks (Figure 4(e)). In order to fair comparison be-tween blocks, the amounts of garbage are normalized as thegarbage amounts per house. This visualization is expectedto reveal seasonal differences between two blocks. In addi-tion, the differences between blocks over time can indicatechanges related to garbage wasting such as campaign forgarbage reduction, and changes of garbage policies.
Table I: Schematic information of garbage collection logs.
Field Explanationslip number the number of a recordtimestamp date and time of arrival of the collecting carcompany type classification of company of the cargarbage type type of garbage (e.g., burnable)car id identifier of the cartotal weight overall weight of the car including garbagecar weight weight of the cargarbage weight total weight � car weightunit price price for garbage per kilogramprice total price for the garbageplant id identifier of the garbage treatment plant
Table II: Schematic information of course information.
Field Explanationcar id identifier of the carcourse id identifier of the coursedow day of the week when the car goesblock block where the course belongsnumber of houses the number of houses on the courseyoung or elder rough estimate classes (young or elder)
of average ages of the course
Other contents: Other related contents are obtainedfrom the web services and statistics. G1GP utilizes weatherrecords and population statistics and estimation. In thisapplication, weather records are obtained from downloadingsite2 of Japan Meteorological Agency3. The site providesvarious information related with weather such as max andmin temperature, humidity, weather description (e.g., fine,cloudy, rainy), and so on. For G1GP, weather descriptions ofrelevant days are obtained. Population statistics is obtainedfrom web site4 of National Institute of Population and SocialSecurity Research5.
B. Analytics
G1GP is deigned for motivating citizens for reducinggarbage wastes by revealing the facts about garbage amountsin Fujisawa city. G1GP consists of two basic interfaces: oneis comprehensive garbage analysis over the city, and theother is competitive analysis comparing analytical resultsbetween blocks in the city. Figure 4 shows selected snap-shots of G1GP in comprehensive and competitive analyses,respectively.
1) Comprehensive Analysis (Figure 4 (a)-(c)):Interactive Interface: The interactive interface (Fig-
ure 4(a)) visualizes histogram-based analytical results invarious conditions. There are six components on the inter-face. The right-most component displays the color paletterepresenting color assignments for blocks in the city. The left
2http://www.data.jma.go.jp/gmd/risk/obsdl/index.php3http://www.jma.go.jp/jma/index.html4http://www.ipss.go.jp/pp-shicyoson/j/shicyoson13/t-page.asp5http://www.ipss.go.jp/
five components show aggregated garbage amounts in time-periods (top-most), blocks (left-most), house-wise in blocks(middle), weather (right-up), and season (right-down). Userscan interact with the interface by clicking the bars orsetting the range of the time-periods. Figure 4(a) displaysselection of time-period as Jan. 2016 to Dec. 2016 at the topcomponent and rainy weather at right-up component. Theother results are displaying corresponding analytical results,such as garbage amounts per block in 2016 of rainy days inthe left-most component.
Time-series Analysis: The transitions of garbageamounts in all blocks (Figure 4(b)) depict overall tendencyof garbage amounts as well as different tendencies ofgarbage amount transitions in different blocks. This visu-alization incurs discussion on seasonal differences withineach block as well as among blocks. For instance, mostof the blocks reduce the amounts of garbage in Februaryand increase those from March to May. This is becauseApril is the switch of Japanese fiscal years, therefore manyevents occur such as movements of jobs, purchasing newfacilities and removing the old, etc., and these events incurslarge amounts of garbage. Also, users can see differencesof garbage amounts among blocks. Pair-wise comparisonbetween blocks is also provided as competitive analysis inG1GP shown later in this section.
Future Estimation: One of the important problem incities is future amounts of garbage because of limited spacefor treated garbage. G1GP estimates the future amounts ofgarbage using the estimates from population statistics pro-vided by the national institute. Figure 4(c) draws the estimateof garbage amounts in the future based on an assumptionthat the amount per a citizen does not change. By using‘young or elder’ field, G1GP provides rough estimates ofgarbage amounts on young people (blue line), elder people(black line), and overall (green line) respectively.
2) Competitive Analysis (Figure 4 (d)-(f)):Garbage Amount Comparison: The first competitive
analysis is a fundamental comparison (Figure 4(d)), thatis, comparisons of total garbage amounts (blue boxes) andhouse-wise garbage amounts (black boxes) in the two blocks.This visualization overviews the fundamental differences oftwo blocks. In particular, differences of house-wise garbageamounts indicate the two blocks have different characteris-tics for garbage wasting.
House-wise Garbage Amount Transition: Due to thedifferent characteristics of blocks based on seasons andpopulations, G1GP reveals time-series differences betweentwo blocks (Figure 4(e)). In order to fair comparison be-tween blocks, the amounts of garbage are normalized as thegarbage amounts per house. This visualization is expectedto reveal seasonal differences between two blocks. In addi-tion, the differences between blocks over time can indicatechanges related to garbage wasting such as campaign forgarbage reduction, and changes of garbage policies.
Garbage Collection Logs Collection Routing
Preparation:Successfully Joins
R+
Car Course Area y
A A-1 D1 1
A A-2 D2 2
B B-1 D3 1
B B-2 D1 2
C C-1 D2 1
C C-2 D3 2
S+
x Car Weight Time
1 A 100 4/1/17 10:00
1 B 150 4/1/17 11:00
1 C 120 4/1/17 12:00
2 A 200 4/1/17 13:00
2 B 180 4/1/17 14:00
2 C 110 4/1/17 15:00
⋈ "#$%"#$∧(%)
Implicit keys
0 500000 1000000 1500000 2000000 2500000 3000000
Costs
Naive
ProposedExpected
Joined
#Joined tuples
Analytic Scenarios:Comprehensive Analysis
Monthly garbage weight
Mon
thly
garb
age
wei
ght
A B C D E F G HI J K L M
Garbage estimate
Ann
ual g
arba
ge w
eigh
t
kt
kt
kt
kt
kt
kt
kt
Year
Young Elder All
↑ Monthly change per area.↖ Rainy day in 2016.← Future estimates.etc.
Analytic Scenarios:Comparative Analysis
Monthly garbage weight
MonthA B
Hou
se-w
ise g
arba
ge w
eigh
t
A B
Comparison of garbage weights
Tota
l gar
bage
wei
ghts
House-w
ise garbage weights
Total House-wise
↑ Garbage amount comparisonin total and in house-wiseperspectives.
↑ Monthly changes of garbageamounts on different areas.
Conclusion
PreparationData Integration Data Cleansing
AnalysisOLAP Stream Data Analysis
VisualizationInteractive I/F Spatio-temporal Vis.
Toolbox
RCar Course Area
A A-1 D1
A A-2 D2
B B-1 D3
B B-2 D1
C C-1 D2
C C-2 D3
SCar Weight Time
A 100 4/1/17 10:00
B 150 4/1/17 11:00
C 120 4/1/17 12:00
A 200 4/1/17 13:00
B 180 4/1/17 14:00
C 110 4/1/17 15:00
⋈ Car=Car
Collecting Car
Car Course Area
A A-1 D1
A A-2 D2
B B-1 D3
B B-2 D1
C C-1 D2
C C-2 D3
Car Course Area
A A-1 D1
A A-2 D2
B B-1 D3
B B-2 D1
C C-1 D2
C C-2 D3
Car Course Area
A A-1 D1
A A-2 D2
B B-1 D3
B B-2 D1
C C-1 D2
C C-2 D3
Routing Info.
Disambiguation & Join
Analytics
Interactive I/F
data insertion
user interaction