13
1 The Technische Universität Berlin Faculty IV Electrical Engineering and Computer Science The Data Analytics Track: A Guidance Document Juan Soto, Dr. Ralf Detlef Kutsche, and Prof. Dr. Volker Markl Database Systems and Information Management (DIMA) Group Last Updated: October 10, 2017 Email: [email protected] Synopsis. In Fall 2013, TU Berlin’s Faculty IV Electrical Engineering and Computer Science (EECS) approved a new track, which enables students pursing a M.Sc. in Computer Science, Information Systems Management or Computer Engineering, to specialize in data analytics. To meet the track requirements, students must complete courses in three core competencies: (1) scalable data analytics, (2) scalable data management, and (3) a domain-specific application area. This guidance document offers students general advice, in the selection of courses, procedure to follow when identifying a thesis topic, and prospective career possibilities. Students who complete both their respective M.Sc. degree and track requirements, will receive – besides their M.Sc. degree – a Data Analytics Master’s Track Certificate issued by Faculty IV, in addition. 1. Motivation 1 The last decades were marked by the digitization of virtually all aspects of our daily lives. Today, industry, government institutions and NGOs, and, of course, science and engineering face an avalanche of digital data daily. Partially due to a reduction in disk storage costs, a paradigm shift towards cloud storage services, and the ubiquitous availability of networked devices. At first glance, this appears to be favorable for our increasingly networked society. However, in many ways it is a burden. Data (often appearing as ‘raw data’) is neither information, nor knowledge. Nonetheless, data is of great value, once it has been refined and analyzed, to address well-formulated questions, concerning problems of interest. It is only then that economic and social benefits can be fully realized. Modern big data analytics questions are often solved using techniques drawn from varying fields, including graph and network analysis, machine learning, mathematics, statistics, signal processing, and text processing, among others. Currently, data scientists, well versed in scalable data analysis methods, scalable systems programming, and knowledge in an application domain are needed to derive insight from big data. Unfortunately, data scientists with skills in both scalable systems and (potentially domain specific) data analysis methods are few in number. They are expensive and in high-demand. Consequently, this limits the amount of value that can currently be generated from big data for society as a whole. Moreover, despite the ever-increasing number of data science programs (at universities worldwide) and student enrollments, it will still be impossible to educate these so-called Jack-of-all-trades, as the required skills are both complex and diverse (as depicted in Figures 1 and 2). Before big data, only few programmers with MPI expertise, predominantly located in supercomputing centers were sufficient in number. For many decades, software engineers and general users in varying domains did not have to worry about scalability issues in their computing systems, thanks in part to higher-level programminglanguages,compilers,anddatabasesystems. 1 The motivation section was predominantly drawn from Prof. Volker Markl [1, 2].

The Technische Universität Berlin Faculty IV Electrical ... · In Fall 2013, TU Berlin’s Faculty IV Electrical Engineering and Computer Science (EECS) approved a new track, which

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: The Technische Universität Berlin Faculty IV Electrical ... · In Fall 2013, TU Berlin’s Faculty IV Electrical Engineering and Computer Science (EECS) approved a new track, which

1

TheTechnischeUniversitätBerlinFacultyIVElectricalEngineeringandComputerScienceTheDataAnalyticsTrack:AGuidanceDocumentJuanSoto,Dr.RalfDetlefKutsche,andProf.Dr.VolkerMarklDatabaseSystemsandInformationManagement(DIMA)GroupLastUpdated:October10,2017Email:[email protected]. In Fall 2013, TU Berlin’s Faculty IV Electrical Engineering and Computer Science (EECS)approvedanewtrack,whichenablesstudentspursingaM.Sc.inComputerScience,InformationSystemsManagementorComputerEngineering,tospecializeindataanalytics.Tomeetthetrackrequirements,studentsmustcompletecourses in three core competencies:(1)scalabledataanalytics,(2)scalabledatamanagement,and(3)adomain-specificapplicationarea.Thisguidancedocumentoffersstudentsgeneraladvice, intheselectionofcourses,proceduretofollowwhen identifyingathesistopic,andprospectivecareerpossibilities.StudentswhocompleteboththeirrespectiveM.Sc.degreeandtrackrequirements,willreceive–besidestheirM.Sc.degree–aDataAnalyticsMaster’sTrackCertificateissuedbyFacultyIV,inaddition.1. Motivation1Thelastdecadesweremarkedbythedigitizationofvirtuallyallaspectsofourdailylives.Today,industry,governmentinstitutionsandNGOs,and,ofcourse,scienceandengineeringfaceanavalancheofdigitaldatadaily.Partiallyduetoareductionindiskstoragecosts,aparadigmshifttowardscloudstorageservices, and the ubiquitous availability of networked devices. At first glance, this appears to befavorableforourincreasinglynetworkedsociety.However,inmanywaysitisaburden.Data(oftenappearingas‘rawdata’)isneitherinformation,norknowledge.Nonetheless,dataisofgreatvalue, once it has been refined and analyzed, to address well-formulated questions, concerningproblemsofinterest.Itisonlythenthateconomicandsocialbenefitscanbefullyrealized.Modernbigdataanalyticsquestionsareoftensolvedusingtechniquesdrawnfromvaryingfields,includinggraphandnetworkanalysis,machinelearning,mathematics,statistics,signalprocessing,andtextprocessing,amongothers.Currently,datascientists,wellversedinscalabledataanalysismethods,scalablesystemsprogramming,andknowledgeinanapplicationdomainareneededtoderiveinsightfrombigdata.Unfortunately,datascientistswithskillsinbothscalablesystemsand(potentiallydomainspecific)dataanalysismethodsarefewinnumber.Theyareexpensiveandinhigh-demand.Consequently,thislimitstheamountofvaluethatcancurrentlybegeneratedfrombigdataforsocietyasawhole.Moreover,despitetheever-increasingnumberofdata scienceprograms (atuniversitiesworldwide)andstudentenrollments, itwillstillbeimpossibletoeducatetheseso-calledJack-of-all-trades,astherequiredskillsarebothcomplexanddiverse(asdepictedinFigures1and2).Beforebigdata,only fewprogrammerswithMPIexpertise,predominantlylocatedinsupercomputingcentersweresufficientinnumber.Formanydecades,softwareengineersandgeneralusersinvaryingdomainsdidnothaveto worry about scalability issues in their computing systems, thanks in part to higher-levelprogramminglanguages,compilers,anddatabasesystems. 1ThemotivationsectionwaspredominantlydrawnfromProf.VolkerMarkl[1,2].

Page 2: The Technische Universität Berlin Faculty IV Electrical ... · In Fall 2013, TU Berlin’s Faculty IV Electrical Engineering and Computer Science (EECS) approved a new track, which

2

Incontrast,today’sexisting technologieshavereachedtheirlimitsduetobigdatarequirements,whichinvolvedatavolume,datarateandheterogeneity,andthecomplexityoftheanalytics.Indeed,theneedformoreadvancedanalyticswillgobeyondrelationalalgebra.Theywillneedtoemploycomplexuser-definedfunctionsandsupportbothiterationsanddistributedstate.

Figure1.AdatascientistisaJack-of-All-Trades.Intheeraofmany-coreprocessors,cloudcomputing,andNoSQL,wemustensurethatwell-establisheddeclarativelanguageconcepts(inherentinrelationaldatabasesystems)maketheirwayintobigdatasystems.Tomakethisareality,theresearchcommunitywillneedtoaddresstherelatedchallenges.Forexample, (i) designing a programming language specification that does not require systemsprogramming skills, (ii)mappingprogramsexpressed in thisprogramming language toa computingplatformoftheirownchoosing,and(iii)executingtheseinascalablemanner.Thismeans devising execution strategies thataredistributed, parallelized, and supportboth in-memory technologies and out-of-core execution for data-intensive algorithms. To meet thischallengethecompiler,dataanalysis,databasesystems,distributedsystems,andmachinelearningcommunities,amongothers,willhaveto cometogether.Wewillhavetodevelopnovelscalablealgorithmsandsystemsthatcanorganizethedatadelugeanddistillinformationtocreatevalue.Furthermore,thepowerofdeclarativelanguages,toenableautomaticoptimization,parallelization,and the adaptationof aprogram tovaryingdistributed systems andnovelhardwarearchitectures(dependingondatadistribution,datasize,datarate,andsystemload)mustbepreserved.Inthisway,wewillovercomethecurrent“stoneage”inbigdataanalytics.Thatis,algorithmspecificationsinsystems thatdonotautomaticallyoptimize(e.g.,MPI,MapReduce),imperativelanguages(e.g.,C),object-orientedlanguages(e.g.,Java),andrelational-orientedlanguages(e.g.,SQL,XQuery)withnon-tunableexternaldriverprograms,andtechnicalcomputingsystems(e.g.,R,MATLAB)thatdonotscale.

Page 3: The Technische Universität Berlin Faculty IV Electrical ... · In Fall 2013, TU Berlin’s Faculty IV Electrical Engineering and Computer Science (EECS) approved a new track, which

3

Figure2.Deepanalysisisthenameofthegame.2. DetailedDescriptionsoftheDataAnalyticsTrackRulesPleasestudythefollowingsubsectionsverycarefully,mostofyourquestionsshouldbeanswered.2.1 QualificationandMainCompetenceAreas. TheDataAnalyticsMaster’s TrackqualifiesstudentstopursuecareersasaDataScientist,DataAnalyst,orDataEngineer.Theywill learnaboutdataanalysismethods,theirapplicationtoreal-worldproblemsinvaryingdomains,learnmoreabouttheinternalsofdatabasesystems,anddevelopprogrammingskillswithafocusonmassively-paralleldataprocessingsystems.2.2 Requirements. Students followingthetrackshouldbeenrolled in one of the following TUBerlinMaster’sPrograms:Computer Science (‘Informatik’), Information SystemsManagement (‘Wirtschafts-informatik’)orComputerEngineering(‘TechnischeInformatik’).TheiracceptancetotheDataAnalyticstrackisbydefault.However,verydedicatedstudentsfromotherprogrammes(e.g.Mathematics)alsocouldbeeligible.Thiswillbedecidedonindividualrequest,carefullycheckingtheapplicationandthebackgroundfromtheCV,andfromyour(inthiscase:mandatory)motivationletter.2.3 Prerequisites: Students interested in joining the track should possess: (a) very strong Englishlanguageskills,(b)programmingskillsinfunctional(e.g.,Scala)andobjectoriented(e.g.,Java)programminglanguages,(c)fundamentalskillsindatabasemanagementsystems,and(d)knowledgeinmathematicalfoundations(e.g.,linearalgebra,probability,statistics).

Page 4: The Technische Universität Berlin Faculty IV Electrical ... · In Fall 2013, TU Berlin’s Faculty IV Electrical Engineering and Computer Science (EECS) approved a new track, which

4

2.4CreditPointsandTrackStructure.ToearnaM.Sc.degree,studentsmustachieve120ECTScreditpoints.Ofthese,90ECTScreditpointsmustfulfilltherequirementsdescribedfurtherbelow,toqualityforthetrackcertificate.CreditPoints Competence Course Notes

24ECTS DataAnalytics(DA)

MachineLearning1orMachineIntelligenceI

mandatorycourse

DAElective1 seeAppendixA,Table1DAElective2DAElective3

18ECTSScalableDataManagement

(SDM)

DatabaseTechnology mandatorycourseSDMElective1 seeAppendixA,Table2SDMElective2

6ECTS DomainSpecificApplication(DSA)

DSAElective seeAppendixA,Table3

9ECTS Project ProjectElective seeAppendixA,Table43ECTS Seminar SeminarElective seeAppendixA,Table5

30ECTS Thesis Master’sThesisThethesismustbeadataanalyticsoriented topic, supervised by a TUBerlinDataAnalyticsLabProfessor.

Total:90ECTS 2.5EnrollingintheTrack.Toenroll in thetrack,studentsmust join the“DataAnalyticsTrack”courselocatedathttps://www.isis.tu-berlin.de/2.0/course/view.php?id=1069.

2.6MentoringProgram.TrackparticipantsareinvitedtocontactamemberoftheDataAnalyticsLabtoidentifyamentorandrequestguidance.2.7ChangestotheTrack.Trackrequirementsmaychangeannually.Therefore,studentsarerequiredtoregularlymonitorannouncementspostedontheISISDataAnalyticsTrackforum.

Page 5: The Technische Universität Berlin Faculty IV Electrical ... · In Fall 2013, TU Berlin’s Faculty IV Electrical Engineering and Computer Science (EECS) approved a new track, which

5

AppendixA.RepresentativeListofElectiveCoursesAcrossCompetencyAreas

Table1.Arepresentative(butincomplete)listofeligible2dataanalyticscourses.

Title ECTS Prof Term Lang1. EconometricAnalysisofLongitudinalandPanelData 6 AxelWerwatz Winter EN

2. MachineIntelligenceII 6 KlausObermayer Summer EN

3. MachineLearning2 9 Klaus-RobertMüller Summer EN+DE

4. Microeconometrics 6 AxelWerwatz Winter EN

5. MonteCarloMethodsinMachineLearningandArtificialIntelligence

6 ManfredOpper Summer EN

6. MultivariateAnalysis/BusinessStatistics 6 AxelWerwatz Summer EN

7. NumerischeMathematikfürIngenieureII 10 JörgLiesen Winter DE

8. PraktikumMaschinellesLernen 9 Klaus-RobertMüller Summer EN

9. ProbabilisticandBayesianModellinginMLandAI 6 ManfredOpper Summer EN

10. StochastischeModelle 10 MichaelScheutzow Winter DE

11. TimeSeriesAnalysis 6 AxelWerwatz Winter EN

12. TreatmentEffectAnalysis 6 AxelWerwatz Summer EN

13. Ökonometrie 6 AxelWerwatz Winter EN

14. DigitaleSignalverarbeitung 12 ReinholdOrglmeister Winter DE

Table2.Arepresentative(butincomplete)listofeligible3scalabledatamanagementcourses.

Title ECTS Prof Term Lang

1. AIM1Heterogeneous&DistributedInformationSystems 6 Ralf-DetlefKutsche Winter EN

2. AIM2ManagementofDataStreams 6 VolkerMarkl WinterorSummer

DE

3. AIM3ScalableDataScience:Systems&Methods(SDSSM) 6 VolkerMarkl WinterorSummer

EN

4. CIT9-CloudComputing 6 OdejKao WinterorSummer

EN

5. IDB-PRA:ImplementationofaDatabaseEngine 6 VolkerMarkl Winter EN

6. ParallelSystems 6 Hans-UlrichHeiss Summer EN

7. TheorieVerteilterAlgorithmen 6 UweNestmann Winter EN+DE

2Alternativedataanalyticscoursesmaybeproposed,buttheyaresubjecttoapproval.3Alternativescalabledatamanagementcoursesmaybeproposed,buttheyaresubjecttoapproval.

Page 6: The Technische Universität Berlin Faculty IV Electrical ... · In Fall 2013, TU Berlin’s Faculty IV Electrical Engineering and Computer Science (EECS) approved a new track, which

6

Table3.Arepresentative(butincomplete)listofeligible4domainspecificapplicationcourses.

Title ECTS Prof Term Lang1. AppliedEmbeddedSystemsProject 6 BernardusJuurlink Summer EN

2. Aussenwirtschaft 6 FrankHeinemann Winter DE

3. DigitalCommunities 6 AxelKüpper Winter EN

4. DigitaleNachrichtenübertragung(TechnischeInformatik) 9 ThomasSikora Summer DE

5. ElectronicCommerce 6 AxelKüpper Winter DE

6. Energiewirtschaft-Elektrizitätswirtschaft 6 ChristianHirschhausen Summer DE

7. Energiewirtschaft-TechnologieundInnovation 6 ChristianHirschhausen Summer DE

8. EnergyEconomics 6 GeorgErdmann Winter EN

9. ExperimentalandBehaviouralEconomics 6 DorotheaKübler Summer EN

10. GesundheitsökonomieII 6 MarcoRunkel Winter DE

11. IntegriertesInformationsmanagement 6 RüdigerZarnekow Summer DE

12. IntelligenteSicherheitinNetzwerken 9 SahinAlbayrak Summer DE

13. IT-Service-Management 6 RüdigerZarnekow Winter DE

14. Messtechnik 12 RolandThewes Winter DE

15. MethodsforNetworkEngineering(OR2) 6 ChristianHirschhausen WinterorSummerSummer

EN

16. OpenSourceandIPintheDigitalSociety 6 KnutBlind WinterorSummer

EN

17. PatentrechtundPatentmanagementI 6 JürgenEnsthaler Summer DE

18. Quellencodierung-Multimediasignalverarbeitung(TechnischeInformatik)

9 ThomasSikora Summer DE

19. SemanticSearch 6 SahinAlbayrak Summer DE

20. Signalverarbeitung 6 ReinholdOrglmeister Winter DE

21. SpeechSignalProcessingandSpeechTechnology 6 SebastianMöller Winter EN+DE

22. TechnischeDiagnoseI 6 ClemensGühmann Summer DE

23. TheEconomicsofClimateChange 6 OttmarGeorgEdenhofer Summer EN

24. Wirtschaftsprüfung 6 MaikLachmann Summer DE

4Alternativedomainareaapplicationcoursesmaybeproposed,buttheyaresubjecttoapproval.

Page 7: The Technische Universität Berlin Faculty IV Electrical ... · In Fall 2013, TU Berlin’s Faculty IV Electrical Engineering and Computer Science (EECS) approved a new track, which

7

Table4.Arepresentative(butincomplete)listofeligible5projectcourses.

Title ECTS Prof Term Lang1. IMPRO3-BigDataAnalyticsProject(BDAPRO) 9 VolkerMarkl Winteror

SummerEN

2. Master-Projekt:VerteilteSysteme 9 OdejKao Winter DE

3. ProjectMachineLearning 9 Klaus-RobertMüller Winter EN

4. ProjectNeuralInformationProcessing 9 KlausObermayer Summer EN+DE

5. Project:StatisticalMethodsinAIandML 9 ManfredOpper Winter EN+DE

6. ProjektNachrichtenübertragung 6 ThomasSikora WinterorSummer

DE

Table5.Arepresentative(butincomplete)listofeligible6seminarcourses.

Title ECTS Prof Term Lang

1. AdvancesinSemanticSearch 3 SahinAlbayrak WinterorSummer

DE

2. AnwendungenKognitiverAlgorithmen 3 Klaus-RobertMüller WinterorSummer

DE

3. BDASEM-BigDataAnalyticsSeminar 3 VolkerMarkl Winter EN

4. CIT8-AktuelleThemenausdemBereichderverteiltenSysteme

3 OdejKao Winter DE

5. HotTopicsinNextGenerationNetworksandFutureInternetTechnologies

3 ThomasMagedanz WinterorSummer

EN+DE

6. HotTopicsinOperatingSystemsandDistributedSystems 3 Hans-UlrichHeiSS WinterorSummer

EN

7. IMSEM-SeminarHotTopicsinInformationManagement 3 VolkerMarkl Summer EN

8. IntroductiontoComputationalGenomics 3 ManfredOpper Summer EN

9. MasterSeminar:OperatingComplexITSystems 3 OdejKao WinterorSummer

EN+DE

10. RecentAdvancesinComputerArchitecture 3 BernardusJuurlink Winter EN

11. RecentAdvancesinMulticoreSystems 3 BernardusJuurlink Summer EN

12. SeminarMeasurementandTechnicalDiagnosis 3 ClemensGühmann WinterorSummer

EN+DE

13. SeminarSoftwareEngineeringofEmbeddedSystems 3 SabineGlesner WinterorSummer

EN

14. SynchronousandAsynchronousInteractionsinDistributedSystems

3 UweNestmann Summer EN+DE

5Alternativeprojectcoursesmaybeproposed,buttheyaresubjecttoapproval.6Alternativeseminarcoursesmaybeproposed,buttheyaresubjecttoapproval.

Page 8: The Technische Universität Berlin Faculty IV Electrical ... · In Fall 2013, TU Berlin’s Faculty IV Electrical Engineering and Computer Science (EECS) approved a new track, which

8

AppendixB.ExampleCurriculums*(Thesewillvarydependingonanindividual’sinterests.)*Caveat:Courseavailabilityandspecifications (e.g., language)varyeachsemester.Thus,someofthecontentbelowmaybeinaccurateforthecurrentsemesterorbecomeinaccuratefor the following semester. It is the responsibility of each student to obtain the latestinformationandensurealldegreerequirementsandconstraintsaremet.ARepresentativeExampleofaComputerScience(Stats/EconometricsConcentration)CurriculumOption.

FirstSemester(WinterTerm)

Title ECTS Area Term LangMachineIntelligenceIorMachineLearningI 6 DataAnalytics

(mandatory)Winter EN

DatabaseTechnology 6 ScalableDataMgmt.(mandatory)

Winter EN

e.g.,IDB-PRA:ImplementationofaDatabaseEngine 6 ScalableDataMgmt. Winter ENe.g.,DigitalCommunities 6 Application Winter EN

e.g.,BDASEM-BigDataAnalyticsSeminar 3 Seminar Winter EN

Elective 3 Winter EN+DE

SecondSemester(SummerTerm)

Title ECTS Area Term Lange.g.,MultivariateAnalysis/BusinessStatistics 6 DataAnalytics(DA)

Summer EN

e.g.,TreatmentEffectAnalysis 6 DataAnalytics

Summer EN

e.g.,MachineIntelligenceII 6 DataAnalytics

Summer EN

e.g.,AIM-3ScalableDataScience:Systems&Methods(SDSSM) 6 ScalableDataMgmt.(SDM)

Summer EN

Elective 6 (e.g.,DA,SDM) Summer EN+DE

ThirdSemester(WinterTerm)

Title ECTS Area Term Lang

e.g.,TimeSeriesAnalysis 6 DataAnalytics(DA)

Winter EN

e.g.,Microeconometrics 6 DataAnalytics

Winter EN

e.g.,AIM-2ManagementofDataStreams 6 ScalableDataMgmt.(SDM)

Winter DE

e.g.,IMPRO3-BigDataAnalyticsProject(BDAPRO) 9 Project Winter EN

Elective 3 (e.g.,DA,SDM) Winter EN+DE

FourthSemester(SummerTerm)

Title ECTS Area Term LangMaster’sThesisintheDataAnalyticsLab 30 DataAnalyticsOriented Summer EN+DE

Page 9: The Technische Universität Berlin Faculty IV Electrical ... · In Fall 2013, TU Berlin’s Faculty IV Electrical Engineering and Computer Science (EECS) approved a new track, which

9

ARepresentativeExampleofaComputerScience(Mathematics Concentration)CurriculumOption.

FirstSemester(WinterTerm)

Title ECTS Area Term LangMachineIntelligenceIorMachineLearningI 6 DataAnalytics

(mandatory)Winter EN

DatabaseTechnology 6 ScalableDataMgmt.(mandatory)

Winter EN

e.g.,AIM-2ManagementofDataStreams 6 ScalableDataMgmt. Winter DE

e.g.,IDB-PRA:ImplementationofaDatabaseEngine 6 ScalableDataMgmt. Winter EN

e.g.,DigitalCommunities 6 Application Winter EN

SecondSemester(SummerTerm)

Title ECTS Area Term Lang

e.g.,MachineIntelligenceII 6 DataAnalytics(DA)

Summer EN

e.g.,AIM-3ScalableDataScience:Systems&Methods(SDSSM) 6 ScalableDataMgmt.(SDM)

Summer EN

Electives 18 (e.g.,DA,SDM) Summer EN+DE

ThirdSemester(WinterTerm)

Title ECTS Area Term Lange.g.,DigitalCommunities 6 Application Winter EN

e.g.,BDASEM-BigDataAnalyticsSeminar 3 Seminar Winter EN

e.g.,IMPRO3-BigDataAnalyticsProject(BDAPRO) 9 Project Winter EN

Electives 12 (e.g.,DA,SDM) Winter EN+DE

FourthSemester(SummerTerm)

Title ECTS Area Term LangMaster’sThesisintheDataAnalyticsLab 30 DataAnalyticsOriented Summer EN+DE

Page 10: The Technische Universität Berlin Faculty IV Electrical ... · In Fall 2013, TU Berlin’s Faculty IV Electrical Engineering and Computer Science (EECS) approved a new track, which

10

ARepresentativeExampleofanInformationSystemsManagementCurriculumOption.

FirstSemester(WinterTerm)

Title ECTS Area Term LangMachineIntelligenceIorMachineLearningI 6 DataAnalytics

(mandatory)Winter EN

DatabaseTechnology 6 ScalableDataMgmt.(mandatory)

Winter EN

e.g.,MultivariateAnalysis/BusinessStatistics 6 DataAnalytics

Winter EN

e.g.,OpenSourceandIPintheDigitalSociety 6 Application Winter EN

e.g.,DigitalCommunities 6 Application Winter EN

SecondSemester(SummerTerm)

Title ECTS Area Term Lang

e.g.,AIM3ScalableDataScience:Systems&Methods(SDSSM) 6 ScalableDataMgmt. Summer EN

e.g.,AIM2-ManagementofDataStreams 6 ScalableDataMgmt. Summer DE

e.g.,CIT9-CloudComputing 6 ScalableDataMgmt. Summer EN

e.g.,ApplicationSystemProject 12 Project Summer EN

ThirdSemester(WinterTerm)

Title ECTS Area Term Lang

e.g.,EconometricAnalysisofLongitudinalandPanelData 6 DataAnalytics

Winter EN

e.g.,TimeSeriesAnalysis 6 DataAnalytics

Winter EN

e.g.,Microeconometrics 6 DataAnalytics

Winter EN

e.g.,IMPRO3-BigDataAnalyticsProject(BDAPRO) 9 Project Winter EN

e.g.,IMSEM-SeminarHotTopicsinInformationManagement 3 Seminar Winter EN

FourthSemester(SummerTerm)

Title ECTS Area Term Lang

Master’sThesisintheDataAnalyticsLab 30 DataAnalyticsOriented Summer EN+DE

Page 11: The Technische Universität Berlin Faculty IV Electrical ... · In Fall 2013, TU Berlin’s Faculty IV Electrical Engineering and Computer Science (EECS) approved a new track, which

11

ARepresentativeExampleofaComputerEngineeringCurriculumOption.

FirstSemester(WinterTerm)

Title ECTS Area Term LangMachineIntelligenceIorMachineLearningI 6 DataAnalytics(DA)

(mandatory)Winter EN

DatabaseTechnology 6 ScalableDataMgmt.(SDM)(mandatory)

Winter EN

e.g.,HotTopicsinNextGenerationNetworksandFutureInternetTechnologies

3 Seminar Winter EN+DE

Electives 15 (e.g.,DA,SDM) Winter EN+DE

SecondSemester(SummerTerm)

Title ECTS Area Term Lang

e.g.,MachineIntelligenceII 6 DataAnalytics(DA)

Summer EN

e.g.,ProbabilisticandBayesianModellinginMLandAI 6 DataAnalytics

Summer EN

e.g.DigitaleNachrichtenübertragung(TechnischeInformatik) 9 Application Summer DE

Electives 9 (e.g.,DA,SDM) Winter EN+DE

ThirdSemester(WinterTerm)

Title ECTS Area Term Lang

e.g.,IDB-PRA:ImplementationofaDatabaseEngine 6 ScalableDataMgmt.(SDM)

Winter EN

e.g.,SpeechSignalProcessingandSpeechTechnology 6 Application Winter EN+DE

e.g.,Master-Projekt:VerteilteSysteme 9 Project Winter DE

Electives 9 (e.g.,DA,SDM) Winter EN+DE

FourthSemester(SummerTerm)

Title ECTS Area Term LangMaster’sThesisintheDataAnalyticsLab 30 DataAnalyticsOriented Summer EN+DE

Page 12: The Technische Universität Berlin Faculty IV Electrical ... · In Fall 2013, TU Berlin’s Faculty IV Electrical Engineering and Computer Science (EECS) approved a new track, which

12

AppendixC.QuestionsandAnswersQ1.Whatisatrack?

A1.Ingeneral,atrackisasuggestedsequenceofcoursesthatprofileaspecificspecialization.StudentswhosuccessfullycompletethetrackwillbeawardedacertificatefromFacultyIV.Acertificateindicatesthatastudenthasfollowedastructuredacademicprogramwiththeintenttopursuespecializationindatascience.

Q2.Whocanfollowatrack?

A2.Bydefault,studentsenrolledintheComputerScience(“Informatik”),InformationSystemsManagement (“Wirtschaftsinformatik”) or Computer Engineering (“Technische Informatik”)Master’sprogramsareeligibletopursuethetrack.Studentsfromotherstudyprogramsshouldcontact the “Data Analytics Track Steering Board” at [email protected] to determinewhethertheycanparticipateinthetrack.

Q3.Willmystudyperiodbeextended,ifIfollowthetrack?

A3. No,neithertheamountofECTScredit points,northenumberofsemesterswillincrease.Moreover,alongerstudyperiodwillnotleadtoadisqualificationfromthetrack.

Q4.Howtogoaboutselectingathesistopic?

A4.StudentsshouldspeakwithSeniorResearchers,Postdocs,orPhDstudents,intheparticipatingresearch groups, i.e. “Chairs,” to identify an open thesis topic of mutual interest. For a list ofrepresentativedatascienceorientedpublicationshavealookat[3,4],andforMaster’sThesistopicssee[5].Foraglimpseintoongoingresearchactivitiesinbigdata/datasciencesee[6].Foropenproblemsandavisionofthefutureofcomputersciencesee[7,8,9],respectively.

Q5.Whataremyprospectivecareerpossibilities?

A5. Students who complete the data analytics track are preparedtopursuecareersasDataAnalysts,DataEngineers,orDataScientists.ForinformationaboutbigdataprojectsinindustrywithinGermanyhavealookat[10].Insomecases,studentsenteraPhDprogramwiththeaimtofurtherspecializeinaresearchtopic,suchasdeeplearningorstreamingsystems.Examplesofrecent(DIMAspecific)PhDthesistopics,include[11,12,13,14,15].Formoreinfo.aboutjobopportunitiesandearningpotentialacrossEuropehavealookat[16].

Q6.IfIstillhavequestionsordoubts,notansweredyet?

A6. This document is assumed to be comprehensive. It should address the most relevantquestions. In case of any doubt (e.g., you are enrolled in a different study programme) orconcern,pleasecontactusat [email protected],please look forannouncements(e.g.,theannual“DataAnalyticsTrackIntroPresentation”)postedontheDataAnalyticsTrackforuminISIS.

Q7.HowdoIobtainmycertificate?

A7.Youwillneedtopresentevidence (e.g.,academic transcript) thatyouhavemet the trackrequirements.Oncethishasbeenverified,DIMAstaffwillprepareyourcertificate.

Page 13: The Technische Universität Berlin Faculty IV Electrical ... · In Fall 2013, TU Berlin’s Faculty IV Electrical Engineering and Computer Science (EECS) approved a new track, which

13

AppendixD.VersionHistoryVersion Authors Date Remarks1.1 MoritzSchubotz,HolmerHemsen,VolkerMarkl 28.06.13 InitialversioninGerman1.2 MoritzSchubotz,JuanSoto,VolkerMarkl 31.07.15 TranslationintoEnglish1.3 MoritzSchubotz,JuanSoto,VolkerMarkl 16.01.16 UpdatesandRevisions2.0 RalfDetlefKutsche,VolkerMarkl,JuanSoto 09.10.17 FullRevision,newversion2

References[1]“Breakingthechains:Ondeclarativedataanalysisanddataindependenceinthebigdataera,”VolkerMarkl,PVLDB,7(13):1730–1733,2014.URL:www.vldb.org/pvldb/vol7/p1730-markl.pdf.[2]TowardsaThrivingDataEconomy:OpenData,BigData,andEcosystems(Presentation),VolkerMarkl,EuropeanCompetitivenessCouncil,March2015.URL:goo.gl/eDRSS3.[3]DIMAweb:DataSciencePublications:http://www.dima.tu-berlin.de/menue/publications/publications/.[4]ML/IDAweb:MachineLearningPublications:http://doc.ml.tu-berlin.de/publications/.[5] Completed Master’s Theses: Many Data Science Oriented, DIMA Group. URL: http://www.dima.tu-berlin.de/menue/theses/completed_mastersdiploma_theses/.[6]BerlinBigDataCenter,http://www.bbdc.berlin/home/.[7] Future Directions in Computer Science Research (Presentation: TU Berlin, Big DataWorkshop), JohnHopcroft,CornellUniversity,September2013.URL:http://www.eecs.tu-berlin.de/index.php?id=139969.[8] 50 Years of Data Science (Version 1.00), David Donoho, Stanford University, September 2015. URL:http://courses.csail.mit.edu/18.337/2015/docs/50YearsDataScience.pdf.[9]FrontiersinMassiveDataAnalysis,NationalAcademiesPress,2013.URL:http://nap.edu/18374.[10]Germany–ExcellenceinBigData,Bitkom,2016.URL:goo.gl/wUSZWv.[11]ScalingDataMininginMassivelyParallelDataflowSystems(PhDThesis),S.Schelter,November2015.[12]SpecificationandOptimizationofAnalyticalDataFlows(PhDThesis),F.Hüske,December2015.[13]Visualization-DrivenDataAggregation(PhDThesis),U.Jugel,TUBerlin,April2017.[14]ExploratoryRelationExtractioninLargeMultilingualData(PhDThesis),A.Akbik,April2016.[15]ProgrammingAbstractions,Compilation,andExecutionTechniquesforMassivelyParallelDataAnalysis(PhDThesis),S.Ewen,TUBerlin,November2014.[16] The European Data Science Salary Survey: Tools, Trends,What Pays (andWhat Doesn’t) for DataProfessionalsinEurope,JohnKing&RogerMagoulas,O’ReillyPress,2017.