eRecruiter Expert System
Presenters: Jonathan Musser, Maxwell Hallum, Jonathan Silliman, Wei Chen
Project websites: Google Project Site, Google Code Site
Agenda
1. Project Review
2. Project Progression
3. Meetings with experts
4. Bolts and Nuts
5. Demo and questions
Part 5-1 Project Review
Wei Chen
eRecruiter
Problem domain: an expert system that helps judge a resume according to expert knowledge.
As an expert system, our system implements:
• Facts from resumes and expert knowledge
• Templates to define the structure of facts and knowledge
• Inference rules for scoring and weighting facts and making decisions
• Explanation of the results of judgments
• Linguistic fuzziness handled through natural language processing
Use cases of the system:
• Quickly create a pool of qualified resumes
• Rank resumes
• Judge an individual resume
Wei 1/5
System design: components
1. Facts generation
2. CLIPS execution
3. Explanation
Wei 2/5
Step 3-1 Facts generation
wxPython and Python
BeautifulSoup, NLTK and Python
Wei 3/5
Step 3-2 CLIPS Execution
Python and PyCLIPS
Wei 4/5
Step 3-3 Explanation
Python and wxPython
Wei 5/5
Part 5-2 Project Progression
Jonathan Silliman
Team Meetings
• Met weekly on Fridays and occasionally on Wednesdays
• Discussion of system design
  – System architecture
  – Scoring system
• Assigned work for the week
  – Timeboxes
  – Coding assignments
Jon S. 1/3
Project Progression
• Researched the topic
• Planned initial framework of the system
  – Scoring on desired employee attributes
• Initial rules and facts
  – Leadership and Skills
• Design for explanation system
• Initial resume parser
• Meeting with Rachel Ligman
  – Reevaluate design
  – Focus on important facts in resume
Jon S. 2/3
Project Progression
• Created new rules and templates
• Developed UI
• Redeveloped resume parser
• Merged CLIPS rules, UI, and the resume parser
• Last meeting with Rachel Ligman
Jon S. 3/3
Part 5-3 Expert Meetings
Maxwell Hallum
Industry Partner - Steve Saunders
• Initial idea for project
• Obtained initial expectations/problem domain
• Formed our initial approach
  – Ranking resumes based on scores in 16 categories
  – Must be able to handle multiple resumes
  – Must allow for flexibility in judging based on the employer's needs
Max 1/3
Course Instructor - Naeem Shareef
• Discussed many of our project ideas with him
• Helped us settle on the eRecruiter system
Max 2/3
Career Service Advisor - Rachel Ligman
• Works for OSU’s Engineering Career Services office
• 1st meeting
  – Explained the actual employment screening process to us
  – Convinced us to look at a far smaller set of qualities
  – Formed our desired/required qualities approach
• 2nd meeting
  – Suggested removal of our Loyalty quality
  – Seemed very enthusiastic when shown our system
Max 3/3
Part 5-4 Bolts and Nuts
Wei Chen and the Team
Work Division and Accomplishments
Individual accomplishments:
• Max: Skills, Certifications, Work Area, min/max degree rules
• Jon M.: Education, Leadership Experience, majors, Normalization rules
• Jon S.: UI, integration of CLIPS and resume parser, testing
• Wei: Resume formatting, resume parsing, resume CLIPS facts generation
Shared accomplishments:
• Discussion on the overall design of the system
• Preparation of knowledge base
• Discussion on facts structure and inference rules
• Discussion on scoring strategy and explanation system
• Timebox, deliverables, expert contact and group meetings
Bolts and Nuts 3-1 Resume parsing and facts generation
Wei Chen
NLTK
• NLTK (Natural Language Toolkit) is used to extract resume facts based on linguistic patterns.
  – “(I) Worked on Ruby on Rails application creating MySQL database.”
  – I/PRP worked/VBD on/IN Ruby/NNP on/IN Rails/NNS application/NN creating/VBG MySQL/NNS database/NN ./.
  – Facts extracted based on noun phrases:
    – Ruby on Rails application (NNP IN NNS NN)
    – MySQL database (NNS NN)
• Other linguistic patterns:
  – Word forms: e.g. lead, leader, leading
  – Word sense: e.g. (help, aid), (manage, organize)
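The noun-phrase pattern on this slide can be sketched in a few lines. This is a minimal pure-Python illustration, not the team's NLTK code: it starts from the already-tagged tokens shown above (so no tagger is run), and the function name `extract_noun_phrases` and the rule "noun runs, optionally joined by a preposition" are assumptions chosen to reproduce the two phrases on the slide.

```python
# Tags treated as nouns (Penn Treebank tag set, as used on the slide).
NOUN_TAGS = {"NN", "NNS", "NNP", "NNPS"}

def extract_noun_phrases(tagged_tokens):
    """Collect runs of nouns; a preposition (IN) may link two noun runs."""
    phrases, current = [], []
    for i, (word, tag) in enumerate(tagged_tokens):
        next_is_noun = (
            i + 1 < len(tagged_tokens) and tagged_tokens[i + 1][1] in NOUN_TAGS
        )
        if tag in NOUN_TAGS:
            current.append(word)
        elif tag == "IN" and current and next_is_noun:
            # Keep the preposition only when it sits between two nouns.
            current.append(word)
        elif current:
            phrases.append(" ".join(current))
            current = []
    if current:
        phrases.append(" ".join(current))
    return phrases

# Tagged sentence copied from the slide.
tagged = [
    ("I", "PRP"), ("worked", "VBD"), ("on", "IN"), ("Ruby", "NNP"),
    ("on", "IN"), ("Rails", "NNS"), ("application", "NN"),
    ("creating", "VBG"), ("MySQL", "NNS"), ("database", "NN"), (".", "."),
]
noun_phrases = extract_noun_phrases(tagged)
```

With NLTK itself, the same idea would normally be expressed as a chunk grammar over the tagger's output; the hand-written loop here only makes the pattern explicit.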
Wei 1/5
HTML resume to CLIPS facts
HTML resume
• Experience
  – Position: Leadership quality
  – Experience description: Work area quality
  – Duration: Loyalty quality
• Skills: Skill qualities
• Certifications: Certification qualities
• Education
  – Degree: Degree quality
  – School: School rank quality
  – Major: Major quality
Diagram levels: DOM root, DOM objects, text area and attributes of objects
Wei 2/5
HTML structure
……
<div id="company1" title="ClearNet Security">
  <div id="position11">Consultant</div>
  <div id="exp_time11">January 2010-April 2010</div>
  <div id="experience11">Worked on Ruby on Rails application creating matching algorithms and UPC database.</div>
</div>
……
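As a rough illustration of how a fragment like this could be turned into structured data, the sketch below walks the nested `div` elements and groups them by `id` prefix. The project used BeautifulSoup; this version swaps in the standard library's `html.parser` so it runs without extra packages. The class name `ResumeParser`, the `FIELD_PREFIXES` mapping, and the output keys are all hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical mapping from div id prefixes (as in the slide) to record keys.
FIELD_PREFIXES = {
    "position": "position",
    "exp_time": "duration",
    "experience": "description",
}

class ResumeParser(HTMLParser):
    """Collect one record per <div id="company..."> with its child fields."""

    def __init__(self):
        super().__init__()
        self.stack = []        # ids of the currently open divs
        self.experiences = []  # one dict per company block

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        attrs = dict(attrs)
        div_id = attrs.get("id") or ""
        self.stack.append(div_id)
        if div_id.startswith("company"):
            self.experiences.append({"company": attrs.get("title", "")})

    def handle_endtag(self, tag):
        if tag == "div" and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        if not self.stack or not self.experiences:
            return
        for prefix, key in FIELD_PREFIXES.items():
            if self.stack[-1].startswith(prefix):
                record = self.experiences[-1]
                # Join text chunks and collapse runs of whitespace.
                record[key] = " ".join((record.get(key, "") + " " + data).split())

html = """
<div id="company1" title="ClearNet Security">
  <div id="position11">Consultant</div>
  <div id="exp_time11">January 2010-April 2010</div>
  <div id="experience11">Worked on Ruby on Rails application creating
    matching algorithms and UPC database.</div>
</div>
"""
parser = ResumeParser()
parser.feed(html)
experiences = parser.experiences
```

Each resulting record carries the company title plus the position, duration, and description text, which is the information the fact-generation step needs.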
Wei 3/5
Wei 4/5
Some coding conventions
• The resume facts CLIPS file is uniquely named ID_Name.clp
• Each deffacts has an ID slot to uniquely identify a candidate
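The two conventions above could be applied roughly as follows. This is only a sketch: the helper names, the `candidate-<ID>` deffacts name, the way the candidate name is sanitized for the filename, and the example slot names are assumptions, not the project's actual generator.

```python
def facts_filename(candidate_id, name):
    """Build the ID_Name.clp filename; non-alphanumerics are dropped (assumption)."""
    safe_name = "".join(ch for ch in name if ch.isalnum())
    return f"{candidate_id}_{safe_name}.clp"

def clips_value(value):
    """Quote strings for CLIPS; leave numbers as they are."""
    if isinstance(value, str):
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    return str(value)

def render_deffacts(candidate_id, facts):
    """Render (template, slots) pairs as one deffacts, adding the ID slot to each fact."""
    lines = [f"(deffacts candidate-{candidate_id}"]
    for template, slots in facts:
        slot_text = " ".join(f"({k} {clips_value(v)})" for k, v in slots.items())
        lines.append(f"  ({template} (ID {candidate_id}) {slot_text})")
    lines[-1] += ")"  # close the deffacts construct
    return "\n".join(lines)

# Hypothetical candidate and facts, for illustration only.
filename = facts_filename(17, "Jane Doe")
text = render_deffacts(17, [
    ("Skills", {"skill": "Python"}),
    ("MonthsExperience", {"months": 4}),
])
```

Putting the ID on every fact lets rules match facts belonging to the same candidate even when several resume files are loaded together.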
Wei 5/5
Bolts and Nuts 3-2 User Interface
Jonathan Silliman
User Interface
• Multiple purposes
  – Get user input
  – Display results
  – Interfaces CLIPS code and resume parser
• Easy access to resumes and explanations
• Clear results
• Flexible input
• Implemented using wxPython
Jon S.1/3
User Inputs
• Skills - Required and desired
• Certifications - Required and desired
• Degree Level - Required and maximum
• Major - Required and desired
• Grade point average - Required
• Leadership Experience - Desired
• Areas of Expertise - Required and desired
• Years Experience - Minimum and maximum
• Cutoffs for Skills, Certifications, and Leadership Experience
• Apply weights to Skills, Certifications, Degrees, Leadership Experience and Areas of Expertise
Jon S.2/3
Explanation System
• Shows contributing score for areas
• Gives user-friendly explanations
• A mixture of explanations from CLIPS and UI code
Jon S.3/3
Bolts and Nuts 3-3 CLIPS Templates and Rules
Max Hallum, Jonathan Musser
CLIPS Templates
• Person: Name, Path, SkillScore, LeadershipScore, EducationScore, DegreeScore, CertificationScore, WorkAreaScore, TotalScore, Normalized, fired, TotalYears, highestDegree
• Explanation: Explanation, section
• LeadershipExp: Position, level, company, duration
• WorkAreaExp: name, duration
• MonthsExperience: months
• Education: Degree, degreelevel, major, GPA
• Degree: Major
• Skills: skill
• Certification: name
• KeyWordOccurance: Phrase, count
• School: Name, quality
• ID
• Fired (1 each)
Max 1/4
CLIPS Rules
Templates: OutsideWantedExp, TotalExp, LeadershipExp, MonthsExperience, WorkAreaExp, Education, Degree, Skills, Certification
• Is-level3, Is-level2, Is-level1
Salience Level 0
• MissingSkills, PresentSkills, DesiredSkill
• MissingWorkAreas, general-DurExp
• MissingCerts, DesiredCert, PresentCerts
Salience Level 1
• Eliminate-outside-degreelevels, Eliminate-below-minGPA
• score-bachelors-school-Listed, score-bachelors-schoolNotListed
• meetsDesiredMajor
• score-masters-school-Listed, score-masters-school-notListed
• Score-PhD-school-Listed, score-PhD-school-Listed
Max 2/4
CLIPS Rules Cont.
Person Template: Normalize (Salience 0), Totaler (Salience -1), OutputScores (Salience -2)
Max 3/4
Skills, Certifications, Work Area Experience
• Very similar structure
• 1 rule to check for desired skills
  – Skills/Certs gain 1 point for each, WAE gains duration of experience
• 2 rules to check for required skills
  – 1 detects the presence of required skills (not in Work Area Experience)
  – 1 detects the absence of required skills
  – Need to generate separate explanations
  – No point change for presence, loses all points for absence
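The desired/required logic described on this slide can be written out in plain Python to make the point arithmetic concrete. The real system implements it as CLIPS rules; the function name `score_skills`, the case-insensitive matching, and the wording of the explanation strings are assumptions made for this sketch.

```python
def score_skills(candidate_skills, desired, required):
    """Return (score, explanations) for one candidate's skills."""
    have = {skill.lower() for skill in candidate_skills}  # case-insensitive (assumption)
    score = 0
    explanations = []

    # Desired skills: +1 point for each one the candidate has.
    for skill in desired:
        if skill.lower() in have:
            score += 1
            explanations.append(f"Desired skill present: {skill} (+1)")

    # Required skills: presence changes nothing, absence removes all points.
    missing = []
    for skill in required:
        if skill.lower() in have:
            explanations.append(f"Required skill present: {skill}")
        else:
            missing.append(skill)
            explanations.append(f"Required skill missing: {skill}")
    if missing:
        score = 0

    return score, explanations

# Hypothetical inputs for illustration.
ok_score, ok_notes = score_skills(
    ["Python", "SQL"], desired=["Python", "Ruby", "SQL"], required=["Python"]
)
bad_score, bad_notes = score_skills(
    ["Python", "SQL"], desired=["Python", "Ruby", "SQL"], required=["Java"]
)
```

Keeping separate messages for present and missing required skills mirrors the slide's note that the two rules need to generate separate explanations.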
Max 4/4
CLIPS Rule Subdivisions
Scoring Rules
• Education
• Leadership Exp
• Desired Skills
• Desired Major
• Desired Certifications
• Work Area Experience
Minimum Requirement Rules
• Minimum GPA
• Required Skills
• Required Certifications
• Required Major
• Minimum Degree
• Min/Max years worked
Jon M.1/6
Education/Major Rules
• Must first eliminate experiences below minimum GPA and below minimum degree level; these will not be counted
• Next check required/desired majors and score if desired
• Finish by scoring the experience
  – Function of school quality and GPA
  – School quality comes from hard-coded facts (U.S. News & World Report Computer Science Rankings)
• Once at this point, deviations between education experiences are minor; the requirements dominate
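The first step, eliminating education entries that fail the minimum requirements, might look like the following outside of CLIPS. The degree ordering, the dictionary keys, and the function name are hypothetical, and the later scoring function (school quality and GPA) is left out because the slides do not give its form.

```python
# Assumed ordering of degree levels, lowest to highest.
DEGREE_LEVELS = {"bachelors": 1, "masters": 2, "phd": 3}

def eligible_education(entries, min_gpa, min_degree, max_degree):
    """Keep only entries that meet the GPA floor and fall within the degree range."""
    low = DEGREE_LEVELS[min_degree]
    high = DEGREE_LEVELS[max_degree]
    return [
        entry for entry in entries
        if entry["gpa"] >= min_gpa
        and low <= DEGREE_LEVELS[entry["degreelevel"]] <= high
    ]

# Hypothetical education entries for one candidate.
entries = [
    {"degreelevel": "bachelors", "major": "Computer Science", "gpa": 3.4},
    {"degreelevel": "masters", "major": "Computer Science", "gpa": 2.8},
    {"degreelevel": "phd", "major": "Physics", "gpa": 3.9},
]
kept = eligible_education(entries, min_gpa=3.0, min_degree="bachelors", max_degree="masters")
```

Only the entries that survive this filter would go on to the major check and the school-quality scoring.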
Jon M.2/6
Leadership Experience Rules
• Three categories
  – Level 1 “Presidential” Level: CEO, Director, etc.
  – Level 2 “Managerial” Level: Managers, lead engineer, etc.
  – Level 3 “General” Level: all remaining positions
• There is some level of leadership score associated with each position
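One simple way to assign a position title to these three levels is keyword matching, sketched below. The keyword lists contain only the examples named on the slide, and both the function name and the substring-matching approach are assumptions; the actual Is-level rules may work differently.

```python
# Keywords taken from the slide's examples; the real rule set may be larger.
LEVEL_1_KEYWORDS = ("ceo", "director")
LEVEL_2_KEYWORDS = ("manager", "lead")

def classify_position(title):
    """Return 1 (Presidential), 2 (Managerial), or 3 (General) for a job title."""
    lowered = title.lower()
    if any(keyword in lowered for keyword in LEVEL_1_KEYWORDS):
        return 1
    if any(keyword in lowered for keyword in LEVEL_2_KEYWORDS):
        return 2
    return 3  # all remaining positions

levels = [classify_position(t) for t in ("CEO", "Lead Engineer", "Consultant")]
```

Level 1 is checked first so that a title matching both lists, such as "Managing Director", lands in the higher category.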
Jon M.3/6
Example of a Scoring Rule Execution
Formula: LdrshpScr = LdrshpScr + lvlscr * (duration/12)^(1/2)
First position: LdrshpScr = 0 + 10 * (60/12)^(1/2) = 22.36
Second position: LdrshpScr = 22.36 + 1 * (10/12)^(1/2) = 23.27
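The accumulation in this example can be checked with a short script. It is a sketch of the formula only: the function name is made up, and the level scores (10 and 1) and durations (60 and 10 months) are simply the values used on the slide.

```python
import math

def leadership_score(positions):
    """Sum lvlscr * sqrt(duration / 12) over (level_score, months) pairs."""
    score = 0.0
    for level_score, months in positions:
        score += level_score * math.sqrt(months / 12)
    return score

# Values from the slide: a 60-month position scored at 10,
# then a 10-month position scored at 1.
after_first = leadership_score([(10, 60)])
after_second = leadership_score([(10, 60), (1, 10)])
```

The square root means extra years in the same position add progressively less, so a long tenure helps but does not dominate the score.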
Jon M.4/6
Normalization
• Once all fields are scored, the CLIPS portion normalizes scores for easier use by the GUI
• Determines realistic maximum values for each subscore
  – Example: Nobody could have a higher skill score than somebody earning every desired skill, so make this the max
• Uses these maximums to determine the relative value of each score before scaling by the weighting factor input from the GUI
  – In essence, make it out of 100%, multiply by weight, and sum
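The "out of 100%, multiply by weight, and sum" step can be sketched as below. The function name, the dictionary layout, and the choice to cap each subscore at its maximum are assumptions; the slides do not say how scores above the realistic maximum are handled.

```python
def weighted_total(scores, maxima, weights):
    """Scale each subscore to 0-100 by its maximum, apply its weight, and sum."""
    total = 0.0
    for area, raw in scores.items():
        max_value = maxima[area]
        # Cap at the maximum (assumption) and guard against a zero maximum.
        fraction = min(raw / max_value, 1.0) if max_value > 0 else 0.0
        total += 100.0 * fraction * weights.get(area, 0.0)
    return total

# Hypothetical subscores, maxima, and GUI weights.
total = weighted_total(
    scores={"skills": 2, "leadership": 5},
    maxima={"skills": 4, "leadership": 20},
    weights={"skills": 0.6, "leadership": 0.4},
)
```

Here skills contribute 50% x 0.6 and leadership 25% x 0.4, so subscores measured on very different raw scales become comparable before they are combined.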
Jon M.5/6
Future Improvements
• Enhanced, intuitive UI
• Fine-tune the scoring system
• More robust system
• Expand the domain
• Ability to parse resumes in their original format
Jon M.6/6
DEMO
Jonathan Silliman and the eRecruiter team