Szilard Pafka – Los Angeles area R users group meeting – November 17, 2010
Software tools for data analysis: (size related to surveyed usage)
C C++ Fortran Java + libraries...
Perl Python Ruby Unix shell
Lisp Clojure
R Matlab Octave Maple Mathematica
SPSS Stata Statistica SAS JMP
Excel
SAS EM SPSS Clementine RapidMiner Weka Mahout
MySQL SQL Server NoSQL stores
Hadoop CUDA
support: editors, code versioning, cloud computing
Possible talks:
December:
1. C, interfaces with R (both ways) / something else?
2. SAS: performance, R interface ready?
3. RExcel
January:
1. Python & R – a comparison
2. NumPy, SciPy
3. Python vs Unix shell / NLTK / NetworkX
Other talks (March-):
1. data storage (SQL and some NoSQL), access from R
2. data mining platforms
3. Hadoop
4. GPU
5. Java
6. Clojure...
Criteria for talks:
usefulness (for data analysis!) and also comparing it with R
paradigm/philosophy, main usage domain, performance, ease of learning, speed of programming, libraries
break down by:
- part of the data analysis process (pre-processing, exploration (e.g. visualization), modeling etc.)
- nature of data (e.g. numeric, categorical, unstructured text, networks/links etc.)
- size of data
stuff that increases functionality: libraries, 3rd party extensions...
does tool X have an R-to-X and/or X-to-R interface?
how these tools can be combined to support the whole process of data analysis