Los Angeles R users group - Nov 17 2010 - Part 1

Szilard Pafka – Los Angeles area R users group meeting – November 17, 2010

Software tools for data analysis: (size related to surveyed usage)

C C++ Fortran Java + libraries...

Perl Python Ruby Unix shell

Lisp Clojure

R Matlab Octave Maple Mathematica

SPSS Stata Statistica SAS JMP

Excel

SAS EM SPSS Clementine RapidMiner Weka Mahout

MySQL SQL Server NoSQL stores

Hadoop CUDA

support: editors code versioning cloud computing

Possible talks:

December:
1. C, interfaces with R (both ways) / something else?
2. SAS: performance, R interface ready?
3. RExcel

January:
1. Python & R – a comparison
2. numpy, scipy
3. Python vs Unix shell / NLTK / networkX

Other talks (March-):
1. data storage (SQL and some NoSQL), access from R
2. data mining platforms
3. Hadoop
4. GPU
5. Java
6. Clojure...

Criteria for talks:

usefulness (for data analysis!) and also comparing it with R

paradigm/philosophy, main usage domain, performance, ease of learning, speed of programming, libraries

break down by:
- part of the data analysis process (pre-processing, exploration (e.g. visualization), modeling etc.)
- nature of data (e.g. numeric, categorical, unstructured text, networks/links etc.)
- size of data

stuff that increases functionality: libraries, 3rd party extensions...

does tool X have an R-to-X and/or X-to-R interface?

how these tools can be combined to support the whole process of data analysis
