Geolocation Data Analysis for Safe Residence using HiveQL
TEAM: PRIYANKA KALE, PRIYAL MISTRY, HITESH JAGTAP GUIDE: DR. JONGWOOK WOO
24th Annual Student Symposium, CSULA
26th February 2016
Table of Contents
1. Introduction
2. Big Data
3. Flowchart
4. Specifications
5. Implementation
6. Visualization
7. GitHub
8. Business Perspective
9. References
Introduction
Goal: To determine whether a location is safe by analyzing a large crime dataset (1.3 GB) for the city of Chicago, IL, collected from 2001 to the present (November 2015).
This is a study of a real dataset provided by the United States government, using Big Data analytics and related tools.
Query output is visualized with graphs and maps for easier interpretation.
Big Data
Volume
Complexity
Variety
Variability
Flowchart
Download Dataset
Upload data into HDFS
Trigger Hive Queries
Result Tables
Output visualization
Specifications
• Microsoft Azure Hortonworks Sandbox:
1. Linux system
2. No. of nodes: 4
3. 8 cores
4. Size: 14 GB
Implementation
Hue is a web application for browsing HDFS and working with Hive queries, Cloudera Impala queries, and MapReduce jobs.
Creation of tables in Hcatalog:
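A minimal sketch of what the table definition might look like in HiveQL. Only the columns iucr, primary_type, location_description, and address appear in the queries in this deck. The remaining column names, their order, the HDFS path, and the use of OpenCSVSerde (available from Hive 0.14) are assumptions based on the layout of the public Chicago crime CSV export, not the team's actual definition.

```sql
-- Sketch only: column names other than iucr, primary_type,
-- location_description and address are assumptions.
-- OpenCSVSerde reads every column as STRING.
CREATE TABLE IF NOT EXISTS crime (
  id                   STRING,
  case_number          STRING,
  crime_date           STRING,
  address              STRING,  -- block-level address, e.g. '008XX N MICHIGAN AVE'
  iucr                 STRING,
  primary_type         STRING,
  description          STRING,
  location_description STRING,
  arrest               STRING,
  domestic             STRING,
  beat                 STRING,
  district             STRING,
  ward                 STRING,
  community_area       STRING,
  fbi_code             STRING,
  x_coordinate         STRING,
  y_coordinate         STRING,
  crime_year           STRING,
  updated_on           STRING,
  latitude             STRING,
  longitude            STRING,
  location_point       STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
STORED AS TEXTFILE
TBLPROPERTIES ('skip.header.line.count' = '1');

-- Hypothetical HDFS path; replace with the location the file was uploaded to.
LOAD DATA INPATH '/user/hue/crime/crimes_2001_to_present.csv' INTO TABLE crime;
```

A quote-aware SerDe is used here because some fields in the export contain commas inside quotes, which a plain comma delimiter would split incorrectly.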
Hive and Beeswax
Hive is an infrastructure built on top of Hadoop for data summarization, querying, and analysis.
Beeswax is an application for running Hive queries.
Processing in Beeswax:
Total number and rank of each crime type:
select primary_type, count(iucr), rank() over (ORDER BY count(iucr) desc) from crime group by primary_type limit 100;
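Without an outer ORDER BY, LIMIT 100 is not guaranteed to return the rows in rank order. One possible variant wraps the query in a subquery and sorts explicitly. The aliases total_crimes, crime_rank, and ranked are introduced here for illustration only.

```sql
-- Same aggregation as above, with named columns and an explicit sort.
SELECT ranked.primary_type,
       ranked.total_crimes,
       ranked.crime_rank
FROM (
  SELECT primary_type,
         count(iucr) AS total_crimes,
         rank() OVER (ORDER BY count(iucr) DESC) AS crime_rank
  FROM crime
  GROUP BY primary_type
) ranked
ORDER BY ranked.crime_rank
LIMIT 100;
```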
Queries and Visualization
Number of crimes per location type for a given area:
select location_description, count(iucr) from crime where address = '008XX N MICHIGAN AVE' group by location_description limit 100;
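To reuse the same query for other areas, the address can be supplied through Hive variable substitution. This is a sketch: the variable name target_address and the alias total_crimes are hypothetical, and it assumes variable substitution is enabled in the client being used.

```sql
-- Set the area of interest once, then reference it in the query.
SET hivevar:target_address=008XX N MICHIGAN AVE;

SELECT location_description,
       count(iucr) AS total_crimes
FROM crime
WHERE address = '${hivevar:target_address}'
GROUP BY location_description
ORDER BY total_crimes DESC   -- most frequent location types first
LIMIT 100;
```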
[Chart: Total, vertical axis from 0 to 1200]
Final Outcome of Analysis:
CREATE TABLE UnsafeArea row format delimited fields terminated by ',' STORED AS RCFile AS select address, count(iucr) AS total_crimes, rank() over (ORDER BY count(iucr) desc) AS rank from crime GROUP BY address;
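Once UnsafeArea exists, it could be queried along the following lines to check a candidate residence. This is an illustrative sketch, not part of the original analysis. The column name rank is quoted with backticks to avoid any clash with the rank() function name.

```sql
-- Look up one address; rank 1 is the address with the most recorded crimes.
SELECT address, total_crimes, `rank`
FROM UnsafeArea
WHERE address = '008XX N MICHIGAN AVE';

-- List the ten addresses with the most recorded crimes.
SELECT address, total_crimes, `rank`
FROM UnsafeArea
ORDER BY total_crimes DESC
LIMIT 10;
```

The deck does not state how a rank or crime count translates into "safe" or "unsafe", so no threshold is applied here.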
GitHub
URL: https://github.com/priya708/Project-520
Business Perspective
Get better advertisement
Predictive policing for police departments: the future of law enforcement?
• Reducing Random Gunfire
• Connecting Burglaries and Code Violations
References
https://catalog.data.gov
https://cwiki.apache.org/confluence/display/Hive/Tutorial
https://hortonworks.com/tutorials
THANK YOU