BIFE Project Presentation
Totem Team: Hongbin Li, Chenhao Wu, Bin Zhou (in alphabetical order)
Outline
- Project Brief Introduction
- Dashboard Demonstration
- Technical Description
- Teamwork Experience
Brief Project Description
Goal:
- Provide a review and analysis of Olympic history.
- Evaluate Olympic promotion at the country level and at the sport level.
- Identify and suggest opportunities for cooperation among countries that share popular sports.
Data source:
- Online database at http://www.databasesports.com/olympics/
- The data is incomplete, erroneous, and inconsistent.
Warehouse:
- A complete ETL and warehouse construction process.
- Schema design.
Report and Dashboard Design:
- 8 attributes, 9 facts, 30+ metrics.
- 30+ reports, built either for testing or for use.
Technical Description
- Raw data extraction: web crawler developed in C# -> html files
- Data preprocessing and LU table generation: done at text level using C# -> txt files
- WH generation and data completion: done at db level using Java and SQL -> one mdb file

Data flow: ~30K html files -> two txt files -> 1 fact + 7 LU txt files -> 1 mdb file
Diagram annotations: athlete level and event level; 70K lines of medal records; normalized IDs generated; consistency check; redundancy check; ready for MSTR project.
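
The consistency and redundancy checks named in the pipeline are not described further in the slides. A minimal Python sketch of what such checks could look like is given here, assuming the fact table is a list of rows and each LU table is a set of valid IDs. The function names, column names, and sample rows are all hypothetical, and the team's own checks were done with Java and SQL.

```python
from collections import Counter

def consistency_check(fact_rows, lookup_ids, key):
    """Return fact rows whose `key` value is missing from the lookup table."""
    return [row for row in fact_rows if row[key] not in lookup_ids]

def redundancy_check(fact_rows):
    """Return one copy of every row that occurs more than once."""
    counts = Counter(tuple(sorted(row.items())) for row in fact_rows)
    return [dict(items) for items, n in counts.items() if n > 1]

# Hypothetical sample data: country_id 3 has no LU entry, and one row is repeated.
country_lu = {1, 2}
facts = [
    {"athlete_id": 10, "country_id": 1, "medal": "Gold"},
    {"athlete_id": 11, "country_id": 3, "medal": "Silver"},
    {"athlete_id": 10, "country_id": 1, "medal": "Gold"},
]

orphans = consistency_check(facts, country_lu, "country_id")
duplicates = redundancy_check(facts)
```

Running the consistency check once per LU table would flag fact rows that point at IDs that were never generated, while the redundancy check reports rows loaded twice.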
Technical Description: Challenges Encountered
- Html files with identical names contain different data. Both files should be processed.
- Inconsistent country abbreviations: ICE & ISL, JPN & JAP, IRI & IRN -> manual correction.
- Special Games history: figure skating was held in both Summer and Winter Games -> counted as winter.
- Erroneous data:
  - Implausible ages (108, -56) -> corrected, but still doubtful.
  - Athlete name ambiguity -> no feasible way to resolve completely.
  - Some results do not exist at all -> currently no patch applied.
- Heterogeneous result formats (10.22s, 18m, 127kg, 33:02, 2-1) -> normalized into double type, keeping only distance, weight, time and point measures.
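
The normalization of heterogeneous result formats can be illustrated with a short Python sketch. It assumes that times are converted to seconds, that a bare number is a point score, and that match scores such as 2-1 are left unparsed because they are not a single measure. These rules and the function name `normalize_result` are assumptions for illustration; the slides do not say how the team's C# code handled each case.

```python
import re

def normalize_result(raw):
    """Return (measure, value) as a float, or None if the format is not recognised."""
    text = raw.strip().lower()

    # Clock formats such as 33:02 or 2:10:55.3 are converted to seconds.
    if re.fullmatch(r"\d+(:\d+){1,2}(\.\d+)?", text):
        seconds = 0.0
        for part in text.split(":"):
            seconds = seconds * 60 + float(part)
        return ("time", seconds)

    # A number with an optional unit suffix; no suffix is treated as points (assumption).
    match = re.fullmatch(r"(\d+(?:\.\d+)?)\s*(kg|m|s)?", text)
    if match:
        value = float(match.group(1))
        measure = {"s": "time", "m": "distance", "kg": "weight", None: "point"}[match.group(2)]
        return (measure, value)

    # Anything else (for example the match score 2-1) is dropped.
    return None

samples = ["10.22s", "18m", "127kg", "33:02", "2-1"]
normalized = [normalize_result(s) for s in samples]
```

Returning the measure type alongside the value keeps seconds, metres, kilograms and points from being compared with each other once everything is stored as a double.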
Technical Description: Challenges Encountered (cont'd)
- Unable to render correct results with Selector and View Filter settings. Aggregation for attributes from different hierarchies is not supported. Currently a prompt is employed instead.
- Unable to combine two prompts in one dashboard for two slightly different datasets. Currently one dataset is dropped for a better user experience.
- Unable to count medals at both the athlete and event levels. Currently a duplicate table with no athlete info is used for medal counting at event level.
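
The event-level workaround can be sketched in Python as follows: athlete info is dropped and the remaining rows are collapsed, so a team medal is counted once rather than once per team member. The column names and sample rows are hypothetical, and the team built its table inside the warehouse rather than in Python. One known limitation of this sketch is that two identical medals won by the same country in the same event would collapse into one.

```python
def event_level_medals(athlete_rows):
    """Drop athlete info and keep one row per (games, event, country, medal)."""
    seen = set()
    event_rows = []
    for row in athlete_rows:
        key = (row["games"], row["event"], row["country"], row["medal"])
        if key not in seen:
            seen.add(key)
            event_rows.append(key)
    return event_rows

# Hypothetical athlete-level records: a four-person relay team and one individual medal.
athlete_rows = [
    {"games": 2000, "event": "4x100m Relay", "country": "USA", "medal": "Gold", "athlete": "Athlete A"},
    {"games": 2000, "event": "4x100m Relay", "country": "USA", "medal": "Gold", "athlete": "Athlete B"},
    {"games": 2000, "event": "4x100m Relay", "country": "USA", "medal": "Gold", "athlete": "Athlete C"},
    {"games": 2000, "event": "4x100m Relay", "country": "USA", "medal": "Gold", "athlete": "Athlete D"},
    {"games": 2000, "event": "100m", "country": "USA", "medal": "Gold", "athlete": "Athlete A"},
]

athlete_level_count = len(athlete_rows)
event_level_count = len(event_level_medals(athlete_rows))
```

Here the athlete-level table holds five medal records while the event-level table holds two, which is why a single table cannot serve both counts directly.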
Teamwork Experience
Everyone was involved in each phase. In particular:
- Bin initiated the data collection with his expertise in web search.
- Chenhao and Hongbin jointly conducted data processing and WH construction.
- Chenhao deserves the main credit for the schema design.
Pair work makes sense!
Thank you!