Revenue & employment Analysis of International Students in USA
Team Members: Priyanka Kale, Apekshit Bhingardive, Aditya Verma
Guide: Dr. Jongwook Woo
24th Annual Student Symposium, CSULA
26th February 2016
What is Big Data?
Big data is a term that describes the large volume of data – both structured and unstructured – that inundates a business on a day-to-day basis.
It's not the amount of data that's important. It's what we do with the data that matters.
Machine Learning: big data often doesn't ask why and simply detects patterns.
Digital footprint: big data is often a cost-free byproduct of digital interaction.
Purpose of Analysis
To develop a system that helps determine the revenue generated by international students.
Examining the relationship between new international enrollments and institutional income at public colleges, universities and professional organizations in the US.
Continued..
To understand the effects of increased international student enrollment on net revenue generation in the US
Find out the income from universities
Predict the impact of international students on revenue generation
Predict employment opportunities in the US
• Basic formula for calculating economic Benefit
Analysis is done using:
Analysis of large data sets is done using the Hadoop Distributed File System (HDFS)
Hadoop environment using the Hortonworks Sandbox on Azure
Python and Hive (PyHive) in an IPython Notebook
HUE
Google Fusion tables
WEKA Framework
Loading data into HDFS: the file was uploaded using the Hadoop command-line interface
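The upload step can be sketched from Python as a thin wrapper around the `hdfs dfs -put` command. This is a minimal sketch, not the project's actual script: the function names are hypothetical, and the file and directory paths passed in are assumptions supplied by the caller.

```python
import subprocess


def build_hdfs_put_command(local_path, hdfs_dir):
    """Return the argument list that copies a local file into an HDFS directory."""
    return ["hdfs", "dfs", "-put", local_path, hdfs_dir]


def upload_to_hdfs(local_path, hdfs_dir):
    """Run the upload. Requires the Hadoop client (`hdfs`) on the PATH."""
    command = build_hdfs_put_command(local_path, hdfs_dir)
    # check=True raises CalledProcessError if the hdfs command fails.
    subprocess.run(command, check=True)
```

On the sandbox, a call such as `upload_to_hdfs("fees.csv", "/user/hue/project")` would copy the file into HDFS; both paths here are placeholders.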
Hortonworks Sandbox configuration
Number of nodes: 3; size: Basic A4 with 8 cores and 14 GB memory
Creating tables in HUE from existing data
Connecting to Hive through Python, using an IPython Notebook to write the Python code
Embedding HiveQL inside python code.
Executing the Hive script from python code:
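Embedding HiveQL in Python and executing it might look like the sketch below. It assumes the PyHive package and a HiveServer2 endpoint; the host, port, username, table name (`student_fees`), and column names (`state`, `fees`) are hypothetical placeholders, not the project's actual schema.

```python
def build_revenue_query(table_name):
    """Return a HiveQL query that totals fee income per state, largest first."""
    return (
        "SELECT state, SUM(fees) AS total_fees "
        f"FROM {table_name} "
        "GROUP BY state "
        "ORDER BY total_fees DESC"
    )


def connect_to_hive(host="localhost", port=10000, username="hue"):
    """Open a HiveServer2 connection through PyHive (assumed to be installed)."""
    from pyhive import hive  # imported lazily so the rest works without PyHive

    return hive.Connection(host=host, port=port, username=username)


def fetch_revenue_by_state(cursor, table_name):
    """Execute the embedded HiveQL on a DB-API cursor and return all rows."""
    cursor.execute(build_revenue_query(table_name))
    return cursor.fetchall()
```

In a notebook, `fetch_revenue_by_state(connect_to_hive().cursor(), "student_fees")` would return one `(state, total_fees)` tuple per state. Because the helper only relies on the DB-API cursor interface, it can be tried against any compatible database before pointing it at Hive.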
Visualizing data with Graphs
[Figure: bar chart "Total earning from fees" by state or territory (Alabama through Texas); vertical axis from $0 to $25,000,000,000.]
Major earning states (percentage of total income):
California: 9.55%
New York: 10.84%
Pennsylvania: 7.36%
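The percentages in the pie chart are each state's share of the total income. A minimal sketch of that arithmetic follows; the function name is hypothetical, and the input is assumed to be a mapping from state name to total earnings, such as the rows returned by the Hive query.

```python
def share_of_total(earnings_by_state):
    """Return each state's earnings as a percentage of the overall total."""
    total = sum(earnings_by_state.values())
    if total == 0:
        raise ValueError("total earnings are zero; shares are undefined")
    return {
        state: 100.0 * earnings / total
        for state, earnings in earnings_by_state.items()
    }
```

The input must cover every state, not only the top three, since the shares are relative to the full total.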
Visualizing Data in Google Fusion Tables
Supervised Learning using Classification:
The WEKA framework was used to classify the states according to their total earnings.
The UserClassifier algorithm provided by WEKA was used to generate the classification graph.
The final output of the Hive script executed in Python was processed with this algorithm.
Continued.. The class colors divide the states into categories. For instance, New York lies in the orange zone, being among the top revenue-generating states.
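The idea of grouping states into earnings classes can be illustrated in Python with fixed cut points. This is a swapped-in simplification, not WEKA's UserClassifier (which builds the classifier interactively); the function name, cut points, and class labels are all hypothetical.

```python
import bisect


def assign_revenue_class(total_earnings, cut_points, labels):
    """Map an earnings value to a class label using ascending cut points.

    A value below cut_points[0] gets labels[0]; a value at or above the
    last cut point gets the last label.
    """
    if len(labels) != len(cut_points) + 1:
        raise ValueError("need exactly one more label than cut points")
    # bisect_right counts how many cut points are <= total_earnings.
    return labels[bisect.bisect_right(cut_points, total_earnings)]
```

With placeholder cut points of `[1e9, 1e10]` and labels `["low", "medium", "high"]`, a state earning 2e10 would fall into the "high" class.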
Value Proposition:
International student mobility trends: By 2017, the global middle class is projected to increase its spending on educational products and services by nearly 50 percent.
Institutions can take this growth into consideration!
United States a more welcoming nation!
Predictive Modelling:
Employment analysis: how? By finding data on where international students work after graduation
Based on the number of students employed in current and past years
Number of employers hiring international students in every field of graduate study [Job positions]
References:
https://nces.ed.gov/ipeds/datacenter/
https://github.com/priya708/Project-528
https://gitlab.com/Addylad/Project528BigData/tree/47b3e6469bff4e9b7cbe0d743cb8ad9520dbb786/DataSource
https://cwiki.apache.org/confluence/display/Hive/Tutorial
https://hortonworks.com/tutorials
http://www.nafsa.org/
Thank You!