FundAClassroom.today
Inspire your supporters on DonorsChoose.org
Bobby Kim, Fellow at Insight Data Science
What is the problem?
• In 2013-2014, US teachers spent an average of $513 out of pocket on their classrooms.¹
• DonorsChoose.org, an online crowdfunded charity, is helping reduce the burden.
• Can we predict whether a project will be funded or not based simply on a teacher’s essay?
1. http://www.forbes.com/sites/nicoleleinbachreyhle/2014/08/19/teachers-spend-own-money-school-supplies/
Supervised learning – binary classification
• Data: CSV dataset from DonorsChoose.org
• ~200,000 essays going back to 2012 – 75/25 Funded/Not Funded
• Vectorize DonorsChoose essays using tf-idf, build vocabulary of 4000 words
• Model: L2-regularized logistic regression
• Validation:
• 5-fold cross validation for model tuning using training set (90%)
• ROC AUC using test set (10%)
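The vectorization and scoring steps above can be sketched in plain Python. This is a minimal illustration, not the code behind these slides: the function names, the whitespace tokenizer, the frequency-based vocabulary cap, and the smoothed idf formula are all assumptions chosen for brevity. In practice a library vectorizer and classifier would replace these helpers.

```python
import math
from collections import Counter

def build_vocabulary(docs, max_words):
    """Keep the max_words most frequent tokens across all documents (assumed cap rule)."""
    counts = Counter(tok for doc in docs for tok in doc.lower().split())
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {tok: i for i, (tok, _) in enumerate(ranked[:max_words])}

def tfidf_vectors(docs, vocab):
    """Return one dense tf-idf row per document (smoothed idf, L2-normalized)."""
    n_docs = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: in how many documents each vocabulary token appears.
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks) if tok in vocab)
    rows = []
    for toks in tokenized:
        row = [0.0] * len(vocab)
        for tok, tf in Counter(toks).items():
            if tok in vocab:
                idf = math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1.0
                row[vocab[tok]] = tf * idf
        norm = math.sqrt(sum(v * v for v in row))
        rows.append([v / norm for v in row] if norm else row)
    return rows

def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy essays standing in for the DonorsChoose data; the deck uses max_words=4000.
essays = ["my students need books", "students need tablets for reading", "books books books"]
vocab = build_vocabulary(essays, max_words=4)
rows = tfidf_vectors(essays, vocab)
```

The rows would then feed an L2-regularized logistic regression, tuned with 5-fold cross-validation on the 90% training split, and `roc_auc` would be applied to the predicted probabilities on the held-out 10%.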
About me
• PhD in Computational Biophysics, Rice University
• Built models for protein folding simulations using experimental protein structures and sequence data
Future Directions
• Feature Engineering using NLP
• TextBlob – sentiment analysis, lemmatization, parts of speech tagging, misspelling
• TextSTAT – reading level, subjectivity
• Supervised learning
• SVMs
Essay Format
• Paragraph 1 – Open with the challenge facing your students.
• Paragraph 2 – Tell us more about your students.
• Paragraph 3 – Inspire your potential donors with an overview of the resources you’re requesting.
• Paragraph 4 – Close by sharing why your project is so important.
Data Story