Project acronym: EDSA
Project full name: European Data Science Academy
Grant agreement no: 643937

D1.5 Demand and supply analysis report

Deliverable Editor: Alexander Mikroyannidis (OU)
Other contributors: Inna Novalija (JSI), Erik Novak (JSI), Simon Scerri (Fraunhofer), Diego Collarana Vargas (Fraunhofer)
Deliverable Reviewers: Joos Buijs (TUe), Huw Fryer (SOTON)
Deliverable due date: 31/01/2018
Submission date: 26/01/2018
Distribution level: P
Version: 1.0

This document is part of a research project funded by the Horizon 2020 Framework Programme of the European Union




Change Log

Version  Date        Amended by                     Changes
0.1      20/11/2017  Alexander Mikroyannidis        Outline and responsibilities of contributors.
0.2      11/12/2017  Erik Novak, Inna Novalija      Content input.
0.3      10/01/2018  Diego Collarana, Simon Scerri  Content input.
0.4      11/01/2018  Alexander Mikroyannidis        Version for internal review.
0.5      25/01/2018  Alexander Mikroyannidis        Revised version.
1.0      26/01/2018  Alexander Mikroyannidis        Final QA.

2018 © Copyright lies with the respective authors and their institutions.

Table of Contents

Change Log
Table of Contents
List of Tables
List of Figures
1. Executive Summary
2. Introduction
3. Dashboard user interface
4. Dashboard software architecture
   4.1 Harvesting jobs
   4.2 Harvesting skills
   4.3 Harvesting courses
5. Dashboard evaluation
   5.1 Evaluation Goals
   5.2 Evaluation Setup
   5.3 Evaluation Results and Discussion
6. Conclusion
7. References

List of Tables

Table 1: Summary of demand sources.
Table 2: Summary of supply sources.

List of Figures

Figure 1: Job search in the EDSA dashboard.
Figure 2: Job search with the map view selected.
Figure 3: Job search with the courses view selected.
Figure 4: The dashboard search filters.
Figure 5: Data science learning pathways in the EDSA courses portal.
Figure 6: Building a personalised learning pathway.
Figure 7: Example of a job posting returned by the query “machine learning”. Even though the job posting is written in Spanish, the methodology finds it relevant.
Figure 8: On the left the ten most demanded skills histogram, and on the right the number of job positions timeline, for the query “machine learning”. Hovering over the histogram column shows the number of queried jobs demanding the skill.
Figure 9: The EDSA data acquisition process.
Figure 10: Top fifteen countries with most found job postings. The greatest number of job postings were found for the UK, followed by France and Germany.
Figure 11: Top fifteen most demanded skills. They are mostly comprised of high-level skills, such as “database” and “computer science”, and specific programming languages.
Figure 12: The distribution of online courses by course providers. The most courses were acquired from Coursera, followed by Hackr.io.
Figure 13: The NASA Task Load Index used during the Dashboard usability evaluation.
Figure 14: The Post-Study Usability Questionnaire used during the Dashboard usability evaluation.
Figure 15: Workload analytics from Task 1 (T1) to Task 5 (T5). Mean (M) and Standard deviation (STD) are calculated. Green tones show the positively lowest ratings, and red tones the opposite, higher ratings.
Figure 16: Boxplot of self-perceived success in accomplishing each task, according to NASA TLX results.
Figure 17: Boxplot of how insecure, discouraged, irritated, stressed, and annoyed the participants were during the task execution, according to NASA TLX results.
Figure 18: Post-Study Usability Questionnaire (PSSUQ)


1. Executive Summary

WP1 primarily deals with the design and execution of a series of studies on the demand for data science training. These studies have produced an evidence base of demand for data science training as well as listing gaps in any existing training across Europe. The results of this study have been analysed in order to inform the iterations of the proposed curricula (WP2) and resultant training (WP3).

Following the outcomes of the M18 project review, WP1 has been amended to also include an analysis of the supply of learning resources, in order to match the identified demand. The results of this demand and supply analysis are reflected in the EDSA dashboard. Supply information is collected semi-automatically, with a focus on relevant institutions from Europe as well as online offerings. The information is analysed automatically to identify relevant skills, which are then mapped to job descriptions.

This deliverable reports the types of supply analyses conducted in relation to the demand analysis and how this work has been implemented in the EDSA dashboard. This deliverable also describes how the demand and supply data are presented in the dashboard and how users are able to interact with this data in order to explore the current demand and supply.


2. Introduction

In today's job market the required skills are constantly evolving. This can be seen in more technical fields such as web development and data science, where new tools and libraries are developed and made available to the public at an increasing rate. This is visible in both the research and industry sectors, where a job position might require a previously unseen skill and the applicant needs to learn it to be qualified. Finding courses that teach the skill can be tedious, and there is no guarantee that they will be sufficient.

The EDSA dashboard¹ connects the job market skill demand with the courses that give the required skill knowledge. The EDSA dashboard enables users to search for their desired job position, find out what the required skill set is and which learning materials and courses are appropriate for acquiring the missing skills. We have focused on job positions that require data science skills and courses that are provided by acknowledged course providers. Additionally, the dashboard shows the most demanded skills and hiring locations for the given results.

The main contributions of the EDSA dashboard are a) creating a sizable data set of data science related job postings containing each job posting's title, description, locations and other information, and b) developing a dashboard which, for a given query, shows relevant job postings as well as courses and lectures which give the appropriate skills. The dashboard is updated daily with new job postings, showing the most recent changes. Basic statistics such as the most popular job locations and skills are also shown.

The remainder of this deliverable is organised as follows. First, an overview of the user interface of the dashboard is provided, followed by an in-depth look into the back-end architecture. We then present the mechanisms behind harvesting various types of data for the dashboard, namely jobs, skills and courses. Finally, the user evaluation of the dashboard is presented and the deliverable is concluded.

3. Dashboard user interface

The EDSA dashboard allows users to explore both the current data science skills demand and supply. Users of this dashboard are able not only to explore the current demand in the data science market, but also to find learning materials and training relevant to the skills they will need to secure a specific job position. Additionally, users are supported in building personalised learning pathways, consisting of courses and learning materials that will help them reach their learning goals.

In particular, the EDSA dashboard allows users to:

● View the current demand for data science jobs and skills across Europe.
● Filter demand by required skills and region.
● View trends and statistics regarding data science jobs and skills for a given time frame.
● Explore the current supply of courses and learning materials that will help them acquire certain skills.
● Build personalised learning pathways towards acquiring certain skills.

The following figures show different views of the EDSA dashboard. Figure 1 shows a job search performed via the dashboard. This view is deliberately kept as simple as possible. In this view, queries typed into the search box at the top result in a simple list of related data science jobs. Selecting any job results in additional details of the post being displayed. The toolbar below the query entry box allows users to add or remove additional views.

¹ http://edsa-project.eu/resources/dashboard/


Figure 1: Job search in the EDSA dashboard.

In Figure 2, a map view has been selected. Google Maps is used for the map view, incorporating zoom facilities. Selecting any anchor point in the map brings up details about the job. In Figure 3, the courses view has been selected. The courses view shows recommended courses related to the query, which are offered by the EDSA project consortium and external organisations. This view also displays recommended learning pathways based on the performed search.


Figure 2: Job search with the map view selected.

Figure 3: Job search with the courses view selected.


Figure 4 shows additional filters that users can enable when searching. These filters allow users to specify the required skills, the location of jobs, as well as the start and end time of job adverts.

Figure 4: The dashboard search filters.

The EDSA dashboard not only allows users to explore the current listings of jobs and associated learning resources, but also structures the recommended learning resources into learning pathways, which can be further customised and personalised by users. In order to build their personalised pathways, users of the dashboard start by searching for certain job positions. Based on their searches, the dashboard recommends courses and learning pathways for gaining the required skills. Users may follow these pathways or further personalise them by visiting the EDSA courses portal.²

Figure 5 shows the list of data science learning pathways currently offered by the EDSA courses portal. These pathways have been adapted from the EDISON Data Science Framework³ and consist of recommended data science topics, as well as courses for acquiring certain sets of skills related to these topics. Users can use these pathways as templates in order to build their own pathways by adding courses, monitoring their progress towards completing their pathways, as well as reflecting on the contents of the pathways and on what they have learned, as shown in Figure 6.

Figure 5: Data science learning pathways in the EDSA courses portal.

² http://courses.edsa-project.eu
³ http://edison-project.eu/edison/edison-data-science-framework-edsf


Figure 6: Building a personalised learning pathway.

4. Dashboard software architecture

In this section, we present the content retrieval methodology and describe the different components of the dashboard. The content is retrieved by inserting a query text in the search input. The user may add additional query conditions by selecting the Data Science skills, locations, countries and a time interval in which the job postings were published. Upon submitting, the query is used to fetch the content that matches the conditions. While all query values are used for retrieving job postings, only the input text and skills are used for retrieving the courses and video lectures content. Since courses and video lectures are available online, the location and time interval are irrelevant for retrieving the supply content.

To retrieve the content, we first need to set an appropriate index. The job posting data set is indexed by Wikipedia concepts, Data Science skills, locations, countries and published date, while the course and lecture data sets are indexed only by Wikipedia concepts. The query text is sent through wikification to acquire Wikipedia concepts, which are used for retrieving the relevant content. Next, additional query conditions are used to filter out the content. The remaining content is used to calculate the most demanded skills and hiring locations. Finally, the query results are returned and used to update the dashboard components. This process is developed using QMiner [1], a data analytics platform for processing large-scale real-time streams containing structured and unstructured data.
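The retrieval steps described above can be illustrated with a minimal Python sketch. It is not the QMiner implementation: the in-memory list, the field names, the function names and the rule that a posting must contain all selected skills are assumptions made purely for illustration, and the query is assumed to have already been turned into Wikipedia concepts.

```python
from collections import Counter
from datetime import date

# Hypothetical, hand-made postings; the real dashboard queries QMiner indexes.
JOB_POSTINGS = [
    {"id": 1, "concepts": {"Machine learning", "Python (programming language)"},
     "skills": {"machine learning", "python"}, "country": "Germany",
     "published": date(2017, 3, 1)},
    {"id": 2, "concepts": {"Machine learning", "Statistics"},
     "skills": {"machine learning", "statistics"}, "country": "France",
     "published": date(2017, 6, 15)},
    {"id": 3, "concepts": {"Web development"},
     "skills": {"javascript"}, "country": "Germany",
     "published": date(2016, 11, 20)},
]


def search_jobs(postings, query_concepts, skills=None, countries=None,
                start=None, end=None):
    """Keep postings that share a concept with the query and pass every filter."""
    results = []
    for posting in postings:
        if not (posting["concepts"] & query_concepts):
            continue  # no Wikipedia concept in common with the query
        if skills and not skills <= posting["skills"]:
            continue  # assumption: all selected skills must be present
        if countries and posting["country"] not in countries:
            continue
        if start and posting["published"] < start:
            continue
        if end and posting["published"] > end:
            continue
        results.append(posting)
    return results


def most_demanded_skills(results, top_n=10):
    """Count how many of the retrieved postings demand each skill."""
    counts = Counter(skill for posting in results for skill in posting["skills"])
    return counts.most_common(top_n)


hits = search_jobs(JOB_POSTINGS, {"Machine learning"},
                   countries={"Germany", "France"})
top = most_demanded_skills(hits)
```

The same filtered result set feeds both the job list and the skill statistics, which mirrors the order of steps in the text: concept match first, then the additional conditions, then the aggregates.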

The dashboard is composed of different components. The largest component is a list of job postings. Each job posting is presented by its extracted information, including the Data Science skills extracted from the title and description. Figure 7 shows an example of a job posting in the list. Since Wikifier supports cross- and multi-linguality, the list consists of job postings written in different languages.


Figure 7: Example of a job posting returned by the query “machine learning”. Even though the job posting is written in Spanish, the methodology finds it relevant.

If the user does not have the required skill set, it can be acquired by enrolling in courses shown in the course list. The list shows courses offered by different online course providers that are relevant to the user's input query. Figure 3 (section 3) shows the component containing the course list. Left and right arrows are used to navigate through the list, where each course is presented by its name and course provider. Additionally, the user can watch lectures to get a deeper understanding of a problem. Similar to courses, the video lectures list shows relevant content found on VideoLectures.NET. Clicking the lecture redirects the user to the video lecture homepage.

The dashboard also shows the most demanded skills and a job posting timeline. The timeline shows how the ratio between queried and all job postings has changed since the start of the year 2016. Additionally, this shows a trend of the skill demand in the queried job posting subset. Figure 8 shows the visualizations used to show the skill demand and timeline.
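As a rough illustration of the timeline, the ratio between queried and all job postings could be computed per time bucket as sketched below. The monthly bucket size, the function name and the sample dates are assumptions; the report does not state the granularity actually used.

```python
from collections import Counter
from datetime import date


def monthly_ratio(all_dates, queried_dates):
    """Return {(year, month): queried postings / all postings} per month."""
    all_counts = Counter((d.year, d.month) for d in all_dates)
    hit_counts = Counter((d.year, d.month) for d in queried_dates)
    # Only months with at least one posting appear, so no division by zero.
    return {month: hit_counts[month] / total
            for month, total in sorted(all_counts.items())}


# Illustrative publication dates only, not taken from the EDSA data set.
all_dates = [date(2016, 1, 5), date(2016, 1, 20), date(2016, 2, 3),
             date(2016, 2, 10), date(2016, 2, 25), date(2016, 2, 28)]
queried_dates = [date(2016, 1, 20), date(2016, 2, 3),
                 date(2016, 2, 10), date(2016, 2, 25)]

ratios = monthly_ratio(all_dates, queried_dates)
```

Plotting a ratio rather than a raw count keeps the trend readable even when the total number of harvested postings varies from month to month.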

Figure 8: On the left the ten most demanded skills histogram, and on the right the number of job positions timeline, for the query “machine learning”. Hovering over the histogram column shows the number of queried jobs demanding the skill.


Figure 9: The EDSA data acquisition process.

The EDSA data acquisition process, shown in Figure 9, consists of two parallel pipelines that operate on the extracted (and transformed) demand and supply data. The Wiki-based Tagging and Geo-enrichment (WTG) pipeline shown on the top (responsible: JSI) complements the Ontology-based Information Extraction (OBIE) pipeline shown below (responsible: Fraunhofer), by discovering new skills that are not known to the ontology (SARO⁴). At the moment, these need to be manually added to be recognised in the future. The OBIE pipeline relies on a customised GATE⁵ pipeline. The OBIE pipeline is better in terms of precision, whereas the WTG pipeline compensates for incomplete recall.

The results from both pipelines are integrated using the Silk Framework,⁶ so that all the knowledge discovered is attached to the same data. Following the entity reconciliation process, results are stored for direct access by the EDSA dashboard. Separately, a snapshot of the data set is routinely stored in order to enable time series analysis (the results are not integrated in the EDSA dashboard).
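The idea of attaching the output of both pipelines to the same record can be sketched as follows. This is a simplified stand-in, not the Silk Framework: it assumes both pipelines already emit the same job identifiers, and the function name, field names and sample values are hypothetical.

```python
def merge_annotations(obie, wtg, ontology_skills):
    """Combine per-job skill sets from the two pipelines.

    obie, wtg: dicts mapping a job identifier to a set of skill labels.
    ontology_skills: labels already known to the ontology.
    """
    merged = {}
    for job_id in sorted(obie.keys() | wtg.keys()):
        obie_skills = obie.get(job_id, set())
        wtg_skills = wtg.get(job_id, set())
        merged[job_id] = {
            # Union: the wiki-based pipeline fills gaps left by the ontology.
            "skills": obie_skills | wtg_skills,
            # Labels unknown to the ontology are candidates for manual review.
            "candidates_for_ontology": wtg_skills - ontology_skills,
        }
    return merged


# Illustrative values only.
ontology_skills = {"python", "sql"}
obie = {"job-1": {"python"}, "job-2": {"sql"}}
wtg = {"job-1": {"python", "tensorflow"}, "job-3": {"sql"}}

merged = merge_annotations(obie, wtg, ontology_skills)
```

Keeping the unknown labels in a separate field reflects the manual step mentioned above: new skills are only recognised in the future once someone adds them to the ontology.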

4.1 Harvesting jobs

Open job positions can be found using job search services. These services aggregate job postings by location, sector, applicant qualifications and skill set or type. One such service is Adzuna,⁷ a search engine for job ads which mostly covers English speaking countries. Another service is Trovit,⁸ a leading search engine for classified ads in Europe and Latin America. The service is available in 13 different languages and provides listings of jobs as well as cars, real estate and other products. When applying for a job position, the applicant is required to have a certain skill set. If the requirements are not fulfilled, they can enrol in courses to get the missing skills. Additionally, watching certain lectures can give a deeper understanding of a particular problem, which can increase the probability of getting accepted for a job position.

Jooble⁹, Indeed¹⁰ and XING¹¹ have also been integrated as additional sources for demand data (routine extraction of job postings). Table 1 summarises all demand sources.

⁴ http://vocol.iais.fraunhofer.de/saro/
⁵ https://gate.ac.uk/
⁶ http://silkframework.org/
⁷ https://www.adzuna.com/
⁸ https://www.trovit.com/

Table 1: Summary of demand sources.

Demand Data Source   Responsible for Integration in Dashboard
Adzuna               JSI
Trovit               JSI
Jooble               Fraunhofer
Indeed               Fraunhofer
XING                 Fraunhofer

Since we needed a continuous flow of data, we developed a pipeline for acquiring job postings, courses and lectures. This allows us to provide the dashboard, presented above, with the most recent data. For job postings we targeted portals like Adzuna, with an emphasis on positions in Data Science.

For data acquisition and enrichment, we collected data using dedicated APIs, including the Adzuna API,¹² as well as custom web crawlers. The data was formatted to JSON to aid further processing and enrichment.
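A minimal sketch of bringing records from different sources into one JSON shape is given below. The raw field names and the target schema are hypothetical and do not reflect the actual response format of any of the job portals or the schema used in the project.

```python
import json


def normalise(raw, source):
    """Map a raw record with source-specific keys onto one common schema."""
    return {
        "source": source,
        "title": (raw.get("title") or raw.get("job_title") or "").strip(),
        "description": (raw.get("description") or raw.get("body") or "").strip(),
        "location": (raw.get("location") or raw.get("city") or "").strip(),
        "published": raw.get("created") or raw.get("date"),
    }


# Two made-up records: one shaped like an API response, one like crawler output.
api_record = {"title": " Data Scientist ", "description": "Build ML models.",
              "location": "London", "created": "2017-05-02"}
crawled_record = {"job_title": "Data Engineer", "body": "Maintain pipelines.",
                  "city": "Paris", "date": "2017-05-03"}

records = [normalise(api_record, "api"), normalise(crawled_record, "crawler")]
as_json = json.dumps(records, ensure_ascii=False, indent=2)
```

With a single schema, the later enrichment steps can be written once and applied to every source.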

The next step of data preprocessing is wikification: identifying and linking textual components to the corresponding Wikipedia pages [2]. This is done using Wikifier,¹³ which also supports cross- and multi-linguality, enabling extraction and annotation of relevant information from job postings, courses and video lectures in different languages. Wikification allows us to search for job postings, courses and lectures in multiple languages.

Next, we use the Skill and Recruitment Ontology (SARO) [3] to extract Data Science skills from job postings. For each job posting we match the Wikipedia concepts with the skills found in the SARO ontology and declare the matched concepts as Data Science skills. These skills are then added to the job posting profile. Finally, to allow searching by locations and countries, the job postings were further enriched using the GeoNames¹⁴ ontology to include the latitude and longitude, the corresponding GeoNames ID and the location name.
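The matching and enrichment steps can be sketched as below. The skill vocabulary is a stand-in for the SARO skill labels and the gazetteer is a stand-in for GeoNames; all names, identifiers and coordinates are invented placeholders, and the case-insensitive exact match is an assumption about how concepts and skills are compared.

```python
def extract_skills(concepts, skill_vocabulary):
    """Return the concepts that also appear in the skill vocabulary."""
    vocabulary = {skill.lower() for skill in skill_vocabulary}
    return sorted(c for c in concepts if c.lower() in vocabulary)


def enrich_location(posting, gazetteer):
    """Copy coordinates and identifier of a known place onto the posting."""
    enriched = dict(posting)
    place = gazetteer.get(posting.get("location_name"))
    if place is not None:
        enriched.update(place)
    return enriched


# Placeholder vocabulary and gazetteer; values are not real SARO or GeoNames data.
skill_vocabulary = {"Machine learning", "Python", "Statistics"}
gazetteer = {
    "Exampletown": {"latitude": 50.0, "longitude": 10.0,
                    "geonames_id": 1234567, "country": "Exampleland"},
}

posting = {
    "title": "Data Scientist",
    "location_name": "Exampletown",
    "concepts": ["Machine learning", "Berlin", "python"],
}

profile = enrich_location(posting, gazetteer)
profile["skills"] = extract_skills(posting["concepts"], skill_vocabulary)
```

Concepts that are not in the vocabulary (here the place name) stay available for search but are not counted as Data Science skills.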

⁹ https://jooble.org
¹⁰ https://indeed.com/
¹¹ https://www.xing.com/jobs
¹² https://developer.adzuna.com/
¹³ http://wikifier.org/
¹⁴ http://www.geonames.org/


The job postings data set contains almost 3.3M job postings acquired over a period of 18 months. Job postings were located in 144 different countries, the majority of them in Europe. Figure 10 shows the top fifteen countries with most found job postings. The UK dominates other countries with 906k job postings, followed by France with almost 539k.

Figure 10: Top fifteen countries with most found job postings. The greatest number of job postings were found for the UK, followed by France and Germany.

4.2 Harvesting skills

There were 650 unique Data Science skills extracted from the data set. These include soft skills, such as leadership and management, knowledge of a particular domain, such as machine learning and artificial intelligence, and programming languages. Figure 11 shows the most demanded skills in the data set.

Figure 11: Top fifteen most demanded skills. They are mostly comprised of high-level skills, such as “database” and “computer science”, and specific programming languages.


4.3 Harvesting courses

For courses we targeted different course providers, including Coursera,¹⁵ providing courses from top universities, and Hackr.io,¹⁶ a service which finds the best online programming courses & tutorials. We also targeted VideoLectures.NET to acquire video lectures containing the Data Science tag. The tags are given manually by the VideoLectures team.

Canvas¹⁷, edX¹⁸ and Udemy¹⁹ have also been integrated as additional sources for supply data (routine extraction of relevant courses). Table 2 summarizes all supply sources.

Table 2: Summary of supply sources.

Supply Data Source   Responsible for Integration in Dashboard
Coursera             JSI
VideoLectures        JSI
Canvas               Fraunhofer
edX                  Fraunhofer
Udemy                Fraunhofer

The course data set contains information for over 63k courses, including their title, description and course providers. The data set is comprised of over 8k courses available online and 55k offline courses. Figure 12 shows the distribution of online courses by course providers. The most courses were acquired from Coursera with above 4k, followed by Hackr.io at 2k.

¹⁵ https://www.coursera.org/
¹⁶ https://hackr.io/
¹⁷ https://www.canvas.net/
¹⁸ https://www.edx.org/
¹⁹ https://www.udemy.com/


Figure 12: The distribution of online courses by course providers. The most courses were acquired from Coursera, followed by Hackr.io.

VideoLectures.NET²⁰ is an award-winning free and open access educational video lectures repository. It contains videos of individual lectures as well as lectures given at renowned conferences.

We acquired a data set of over 20k lectures published on VideoLectures.NET. It contains information about the lectures available on the video repository, including title, description and link to the lecture.

5. Dashboard evaluation

5.1 Evaluation Goals

In order to validate the usability and usefulness of the EDSA Dashboard, we conducted a formative usability evaluation. The evaluation was performed to identify to what extent the foreseen users can:

1. Easily perform basic tasks such as searching for a job or understanding the learning paths.
2. Find the dashboard easy and enjoyable to use.

The following tasks were identified beforehand, and they defined the evaluation exercise. The evaluators were asked to:

1. Search/Identification
   T1. Search for a relevant data science job in Germany
   T2. Search for a job based on specific skills (e.g. Python and Java)
   T3. Identify what steps are missing in one’s learning path to become a data scientist

2. Analysis/Exploration
   T4. Analyse which country has more suitable job offers
   T5. Identify which are the top 3 relevant skills for a data scientist

²⁰ http://videolectures.net/


5.2 Evaluation Setup

Number of participants

6 Student Assistants (Data Science area) from Fraunhofer IAIS participated in the evaluation.

Moderator

1 moderator carried out the evaluation. The moderator was open to discussion with the participants, explaining the objective of the tasks without giving any additional instructions on how to complete a task.

Dashboard version

The version used during the evaluation was the last deployment on 22.11.2017:

http://edsa-project.eu/resources/dashboard/

Metrics

We defined the following evaluation metrics:

1. NASA Task Load Index (TLX)²¹: In all cases we used the standard NASA Task Load Index (TLX) to measure workload in loosely time-constrained tasks.
2. Thinking Aloud protocol [4] to uncover usability issues.
3. To measure usability, we used a simplified version of the Post-Study Usability Questionnaire (PSSUQ) [5].

Figure 13: The NASA Task Load Index used during the Dashboard usability evaluation.

²¹ https://humansystems.arc.nasa.gov/groups/tlx/


Figure 14: The Post-Study Usability Questionnaire used during the Dashboard usability evaluation.

Procedure

The moderator carried out the experiment at each participant’s desk, in order to provide a comfortable environment for them. The evaluation introduction also gave some background on the EDSA project, before explaining the tasks requested. Participants were told to speak freely and say aloud what they were thinking during their attempts to carry out the tasks in the EDSA Dashboard; all the while, the moderator took note of all their remarks. After each task, the NASA TLX questionnaire was given to each participant. At the end of all tasks, the PSSUQ questionnaire was also filled out by the participants.

No time limit was applied, but the participants were asked to stop attempting to complete the tasks when they gave up. The time required was recorded, but no timer was shown to the participants, so as not to introduce time pressure while they performed the tasks.

5.3 Evaluation Results and Discussion

In general, the evaluation showed positive scores, indicating that the participants were able to complete the tasks and that the workload demand when using the Dashboard was low. Additionally, the usability scores showed overall user satisfaction with the Dashboard.

The most prominent difficulty in the Dashboard’s use was the intuitiveness during first use, but this showed signs of improvement when performing the next tasks. A How-to Video for new users is therefore highly recommended.

Figure 15 shows the results of the NASA TLX. Overall, all the NASA TLX indexes indicate a positive result. Tasks 3 and 5 were the ones with the lowest level of required effort. In contrast, Task 1 showed the highest level of frustration, although this may be related to the lack of guidance at first use (see above).


Task 4 was the task that required the most effort to accomplish and had the lowest performance of all tasks. In general, the map view proved to be somewhat confusing, as did the interpretation of its values.

Figure 15: Workload analytics from Task 1 (T1) to Task 5 (T5). Mean (M) and Standard deviation (STD) are calculated. Green tones show the positively lowest ratings, and red tones the opposite, higher ratings.
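For readers who want to reproduce this kind of summary, the mean and standard deviation per task and TLX dimension can be computed as sketched below. The ratings are made-up illustrative numbers, not the study's data, and the use of the sample standard deviation is an assumption, since the report does not say which variant was used.

```python
from statistics import mean, stdev

# Made-up ratings per task and dimension, one value per participant.
ratings = {
    "T1": {"effort": [10, 20, 30], "frustration": [20, 40, 60]},
    "T4": {"effort": [40, 60, 80], "frustration": [30, 50, 70]},
}


def summarise(ratings):
    """Return {task: {dimension: (mean, sample standard deviation)}}."""
    return {
        task: {dim: (mean(values), stdev(values)) for dim, values in dims.items()}
        for task, dims in ratings.items()
    }


summary = summarise(ratings)
t1_effort_mean, t1_effort_std = summary["T1"]["effort"]
```

With only six participants, the standard deviation is sensitive to single outliers, so it is best read alongside the boxplots rather than on its own.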

Figure16showstheboxplotoftheNASATLXscoreintermsofperformance.Theuserswereaskedtoratehowsuccessfultheyfeltaccomplishingatask,fromPerfecttoFailure.Overallallthetaskswerecompleted.Task4wastheonlyone(1user)thatfailedtobecompleted.

Figure 16: Box plot of self-perceived success in accomplishing each task, according to NASA TLX results.

Figure 17 shows the box plot of the NASA TLX score in terms of frustration. The users were asked to rate how frustrated they felt while performing the task. Overall, users on average felt low frustration while using the Dashboard. Tasks 1 and 2 show some exceptions, which may be related to the effect of first use. Task 4 shows a higher degree of frustration, related to the user who could not complete the task (as shown in Figure 16).

Figure 17: Box plot of how insecure, discouraged, irritated, stressed, and annoyed the participants were during the task execution, according to NASA TLX results.

Figure 18 shows the results of the PSSUQ questionnaire about the general usability of the Dashboard. All the dimensions shown achieved good scores, indicating that the Dashboard is usable and that participants found it easy to learn and were satisfied with it.

Figure 18: Post-Study System Usability Questionnaire (PSSUQ)

The most relevant Think Aloud remarks gathered during the evaluation process were:

1. Most users had problems locating the search query fields.
2. Most users had a positive reaction to Learning Pathways once they located it.
3. Most users were confused about the purpose of the buttons at the top of the page.

6. Conclusion

Linking the demand for data science skills with the supply of learning resources that offer these skills is crucial for bridging the data science skills gap. Towards this goal, EDSA has developed an interactive dashboard that enables its users to explore both the current data science skills demand and supply.

This deliverable has presented the types of supply analyses conducted in relation to the demand analysis and how this work has been implemented in the EDSA dashboard. In particular, this deliverable has described how the demand and supply data are presented in the EDSA dashboard and how users are able to interact with this data in order to explore the current demand and supply.

The main contributions of this work are: a) creating a sizable data set of data science related job postings containing each posting's title, description, location and other information, and b) developing a dashboard that offers relevant job postings for a given query, as well as courses and learning pathways for gaining the appropriate skills. The dashboard is updated daily with new job postings and courses and will continue to be maintained beyond the lifetime of the project, thus offering a sustainable service to the data science community.

It should be noted that the dashboard will continue to be maintained and further developed after the end of the project. Further improvements will include the addition of a time series analysis based on the data being collected, which is to be implemented by Fraunhofer. More details about this planned work are provided in the Project Exploitation Report (D5.4). Additionally, usability issues that were identified in the evaluation of the dashboard, such as those concerning the map view and the search query fields, as well as other layout issues, will be taken up by the team that will continue to support the dashboard after the project’s end.

7. References

[1] B. Fortuna, J. Rupnik, J. Brank, C. Fortuna, V. Jovanoski, M. Karlovcec, B. Kazic, K. Kenda, G. Leban, A. Muhic, et al. QMiner: Data analytics platform for processing streams of structured and unstructured data. In Software Engineering for Machine Learning Workshop, Neural Information Processing Systems, 2014.

[2] L. Ratinov, D. Roth, D. Downey, and M. Anderson. Local and global algorithms for disambiguation to Wikipedia. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1, pages 1375–1384. Association for Computational Linguistics, 2011.

[3] E. Sibarani, S. Scerri, N. Mousavi, and S. Auer. Ontology-based skills demand and trend analysis, July 2016.

[4] J. S. Dumas and J. C. Redish. A Practical Guide to Usability Testing (1st ed.). Intellect Books, Exeter, UK, 1999.

[5] J. R. Lewis. IBM computer usability satisfaction questionnaires: psychometric evaluation and instructions for use. International Journal of Human-Computer Interaction 7.1 (1995), pp. 57.